INTRODUCTION



The papers collected in this volume are intended both to present a summary 'history' and
description of the IEA Second International Mathematics Study and to illustrate a
variety of approaches to the analysis of the data that emerged from SIMS. The initial
results from SIMS have been reported both in the series of national reports outlined in
the Appendix to this report and in the technical and published reports describing the
results of the international analyses. However, as numerous observers of SIMS have
commented and the papers in this report amply illustrate, these initial analyses (by
members of the core working groups closely associated with the study throughout its
now long history) have barely scratched the surface of the data that SIMS collected and
have not explored the variety of questions and the analytical approaches and methods
that the SIMS database can support. The papers collected here are intended to encourage
others to explore this rich database for both national and comparative studies on the
teaching and learning of mathematics.

Most of the papers in this report were initially presented at a seminar on
Secondary Analysis of the SIMS Database held at the University of Illinois at Urbana-
Champaign in January 1989. Subsequently we became aware of the work of David Baker
and David Stevenson of the Catholic University of America and took advantage of their
willingness to share their exciting research with the SIMS 'community' by incorporating
early versions of two of their papers in the Report. In addition, this report includes
abstracts and an initial review, prepared by Leigh Burstein of the University of
California, Los Angeles, of some of the recently completed U.S. dissertations that have
used the SIMS data.

The University of Illinois at Urbana-Champaign seminar on Secondary Analysis 
of the SIMS Database and this Report are part of a larger project, the SIMS Database 
Enhancement Project, which has as its major tasks the preparation of the public-use 
database which might support further secondary analysis of the SIMS data, the training 
of researchers in the use of both the SIMS data and the database, and the encouragement 
of secondary analysis of SIMS data. The SIMS Database Enhancement Project is 
supported by a grant from the United States National Science Foundation (Grant No. 
NSFSPA87-.S1425). 

Ian Westbury
Kenneth J. Travers

AN OVERVIEW OF THE IEA STUDY

R. A. Garden
Director Research and Statistics 
New Zealand Department of Education 



Introduction 

In the course of the Second International Mathematics Study (SIMS), conducted
under the auspices of the International Association for the Evaluation of Educational
Achievement (IEA), data was obtained from approximately 3,900 schools, 6,200 teachers,
and 124,000 students in more than 20 education systems around the world. This
discussion will not deal in detail with the aims or conduct of the study - these can be
read in a number of the study's publications, the set of five Bulletins for example. Nor will
it acknowledge the contributions to the study of a large number of individuals. To the
extent that recognition of the dedication, perseverance, and special skills contributed by
so many can be made by a few words in a publication, this also has been done elsewhere.
My intention is to give an indication of some of the influences and constraints that
determined the nature of the databank which is a major outcome of the study. It is to be
hoped that users of the databank who find that their favourite variable was not included
in the study, or who discover flaws in the data, or shortcomings in the documentation,
will think about the difficulties associated with the scale of the project, its multinational
nature, and the minuscule levels of funding for key phases of the project, and refrain
from rushing into print with trenchant criticisms.

Data collection, preparation and analysis took place in the first half of this decade
and it is now beginning to be easier to place the various SIMS actors and their actions,
and all the activities of the study, into some sort of perspective. During the study it was
the negative aspects which dominated our lives - the National Research Coordinators
(NRCs) who did not follow instructions, the postal delays, the misunderstandings, the
unreadable data tapes, the mis-coded data, and so on. Now distance in time is beginning
to lend a degree of enchantment to the view, but in order to give some shape to the
overview it helps to adopt a suitable framework.

There are several possibilities. The first which commended itself was a
framework based on the life-cycle metaphor, where progress would be discussed in
terms of conception of the study (perhaps there was even a seduction phase), its birth
(not without subsequent post-natal depression), toddlerhood (poverty-stricken but filled
with hope), adolescence (sturm und drang), maturity (great responsibility but still no
money), and old age (now some money but too late to enjoy it). There doesn't seem to
have been an identifiable death scene, but some of us are certainly being haunted, and
this meeting might well be the platform for resurrection. A battle metaphor also
suggested itself, with declaration, arming and training phases and so on. But to sustain
this metaphor one would have to talk of skirmishes, conflicts, tensions between
generals in the rear and lieutenants down the line, perhaps even some putting of blind
eyes to the telescope, and of course all this would be quite inappropriate. Or would it?

The solution to the problem of how best to map an overview emerged as I
re-read the Bulletins produced throughout the study. Those familiar with the study
documents will recognise the source of the model at once, and those who have been
involved with the study throughout its planning, development and execution will
acknowledge its aptness. Any overview of SIMS would have to recognise the existence
of the study phases shown in Figure 1. There are important differences between what
was intended and what happened, and between what was sought and what was
captured. Without an awareness of the causes of these differences, misinterpretation of
results of analyses of the data is a strong possibility.

Figure 1.

Intended Study:     What the ISC wanted
Implemented Study:  What National Centres did
Attained Study:     International Database and Documentation; Publications; Experience



Framework for the Overview 

The shape of this overview, then, is similar to that by which SIMS participants
came to view the curriculum. The model for the study is, I believe, adaptable to a range
of diverse human activities and organisational processes, but it should be noted that the
interpretation I will share with you is my interpretation. Other people involved in the
study would doubtless have different interpretations. One of the unforgettable lessons
administrators of cooperative international studies learn is that amongst people from
different social, cultural and political backgrounds it is very hard to find a common
perception of many of the things whose meanings we, as individuals, take for granted
within our own socio-cultural environment.

Another factor to be taken into consideration is that although I was involved in
the study from fairly early in its evolution, I did not become International Coordinator
until 1980. Roy Phillipps, the first International Coordinator (and at that time a member
of the IEA Standing Committee), and Ken Travers, Chairman of the International
Steering Committee (ISC) throughout the study, may not share all my perceptions. My
view of early attempts at fundraising and initial planning for SIMS was very much a
worm's eye view. Figure 2, then, provides the framework for my talk.

It should also be noted that my view was from the International Centre in the
Department of Education in Wellington, New Zealand. Work related to the
longitudinal component of SIMS, and to the construction of the relevant sections of the
databank, was carried out at the study Centre at the University of Illinois at Urbana-
Champaign with support from Richard Wolfe and others at the Ontario Institute for
Studies in Education in Toronto. However, the challenges associated with processing
IEA data appear to be independent of geographic location.

Study Antecedents 

Features such as the kind of data that are in the SIMS databank, the way they are
arranged, their characteristics and quality are, in part, the result of events which
occurred before the study was conceived. Earlier IEA studies, the First International
Mathematics Survey (FIMS) and the Six Subject Survey, had seen the development of a
research design and methodology which had 'worked'. Leading researchers from
several countries had been involved in the cooperative development and execution of
the previous studies and now formed the nucleus of an extensive international
community of comparative researchers. SIMS thus became the latest 'baby' of the IEA
family and inevitably manifested a strong genetic blueprint. This carried many benefits -
and a few disadvantages - but the point to note is that the study is unmistakably an IEA
study.

Because this was the second IEA mathematics study there was an especially
strong pattern in existence and, in fact, some participants initially saw the second study
as being a replication of the first. But although there was strong interest in being able to
make comparisons over time, rapid growth in the mathematics education knowledge-
base, an expanding set of techniques of analysis, and competing views of how research
should be carried out quickly reduced the importance of this aim for the study.

SIMS was intended to have a much stronger emphasis on mathematics
education than had FIMS, where mathematics tended to be treated as a surrogate for
school achievement in general. To this end, the ISC, advisers and National Committees
included strong representations of respected mathematics educators. The questions they
wished the study to address were diverse, and many came from a different domain from
those examined in prior IEA studies. Furthermore, many of the questions could not
easily be addressed by traditional IEA survey methods. It is not surprising, given
differing expectations of the study, that from time to time 'father' IEA Standing
Committee did not always see eye to eye with 'mother' ISC over how the SIMS 'baby'
should be reared.

At national (or system) level, the character and history of the institution which
houses the IEA National Centre partly determine how much influence that centre will
have on international instruments and manuals, as well as on the assiduousness with
which international instructions about data collection and preparation are followed.
Previous institutional experience of participation in an IEA study, or of other large scale
survey work, can be expected to contribute to a more complete and error-free national
data set. For the same reasons, the experience, research ability, and other personal
qualities of the National Research Coordinator are reflected in the study outcomes.
However, it should be noted that in SIMS some excellent data sets came from systems in
which relevant experience was not great. An ability (and willingness) to follow
instructions to the letter was the main prerequisite.

Study Contexts

Past IEA studies were an important antecedent to SIMS. The individuals who
had played leading roles in these studies and who occupied influential positions in IEA,
and in the research community generally, when SIMS began constituted an important
contextual factor.

Their experience, knowledge, beliefs and attitudes had a considerable influence
on the conduct of the study. A second contextual factor was what might be called the
"IEA ethos". Early IEA studies were pioneering ventures which, because of the way IEA
had come into being, attracted researchers for whom rewards such as the intellectual
challenge, intrinsic interest, and stimulation of participating cooperatively in an
international venture were enough. Many, perhaps most, of those who play leading
roles in IEA studies are prepared to sacrifice a vast amount of their time to engage in
very difficult tasks for which they receive little, if any, financial recompense. But by the
time SIMS was underway a number of researchers simply could not afford this sort of
financial sacrifice. Finding ways of ensuring that those amongst the best available consultants
and advisers were able to make key contributions to the study utilised a good deal of the
energy and challenged the imagination of study administrators.

Funding, or rather the lack of it, was undoubtedly the greatest handicap faced by
those planning and executing the study. Policies of funding agencies had changed
substantially since earlier IEA studies. Many were targeting funding locally, or at
specific research fields, and were just not "in the market" to provide funding for
international research. Those few funding international research were not necessarily
interested in secondary school mathematics research.

For a large part of the duration of the study there was just enough support to
allow progress to be made. Maintaining the commitment of research teams to difficult
tasks when there is no guarantee that the manuals and instruments they are producing
will ever be used is not easy, and relying only on correspondence (in English) for
communication with researchers from a wide range of nations is definitely not
recommended. Inability to fund meetings of National Research Coordinators early in
the planning phase of the study resulted in a certain amount of "undoing and redoing"
of work. In general, problems of this sort were solved satisfactorily, but there are one or
two places where the repairs are obvious. A case in point is the sets of items added late
to the cognitive instruments in an attempt to meet criticisms that the curricula of
certain European systems were not adequately represented.

Despite the bleakness of the general funding picture at the start of the study, there
were people in some of the funding agencies who sensed the potential of the study and
were able to furnish assistance. As the likely outcomes of the study became more
apparent, further funding was able to be obtained, but even at this stage this was a far
from easy task for IEA advocates within the agencies. Those associated with SIMS are
very grateful for the good work they did. Some of these agencies made further
substantial contributions to the study through the expertise of professionals on their
staffs or through the expertise of researchers they recommended and encouraged to
assist with aspects of the study.

At national level, the levels of funding and resources available were also of
crucial importance. In many National Centres researchers carried out their duties as
National Research Coordinator in addition to a substantial workload from national or
local projects. Furthermore, it was common for them to have to give less priority to
SIMS than to other projects they were working on. In the face of difficulties of this sort
the NRCs did a remarkable job. It is not surprising that there were, from time to time,
problems in meeting deadlines or in meeting specifications in the provision of data.

The quality of IEA studies depends as much on the NRCs as on any other group.
It is they who are responsible for translating the wishes of the ISC into action in cultures
and national milieux quite different from those of people who exerted the major
influences on design of the study. In the phase in which instruments and manuals are
being negotiated they must convey the spirit and intent of the study to their national
committee members, and represent the wishes of their national committees to the ISC.
The success with which SIMS NRCs could meet these demands depended on their
having credibility with mathematics teachers as well as with the research community.
The management challenges in an IEA study are as significant as the research
challenges - and their importance is sometimes overlooked.

We need to remember then, as we examine and manipulate IEA data, that the
contexts in which the study was designed and executed had a lot to do with the nature of
study outcomes. Those carrying out secondary analysis would be well advised to bear
this in mind and to make a real effort to "get a feel" for the contextual factors in those
countries for which interpretation of analyses will be made.

The Intended Study 

Nothing is ever simple in large-scale studies where design and methods are
negotiated by participants from diverse cultural backgrounds. To refer to SIMS as a
single study disguises the scope and complexity of what was really a collection of studies.
To begin with there were two target populations:

Population A: All students in the grade (year level) where the majority
have attained the age of 13.00 to 13.11 years by the middle of the school year.

Population B: All students who were in the normally accepted terminal
grade of the secondary education system and who were studying mathematics as
a substantial part of their academic programme.

At each grade level systems had the choice of administering the full study (i.e. a
longitudinal study which included pretest and post-test and collection of classroom
process and teacher behaviour data) or doing a "reduced" study (cross-sectional with
post-test as the only cognitive measure and without classroom process data). The
participating systems, with the populations tested and the version of the study, are
shown in Table 1. Ontario, British Columbia and the United States were the only
systems to undertake a longitudinal study for Population B.

Table 1. Participating Systems

System               Population(s)  Study*     System            Population(s)  Study*
                                    (Pop A)                                     (Pop A)

Belgium (Flemish)    A&B            L          Luxembourg        A              C
Belgium (French)     A&B            C          The Netherlands   A              C
British Columbia     A&B            L          New Zealand       A&B            L
England & Wales      A&B            C          Nigeria           A              C
Finland              A&B            C          Ontario           A&B            L
France               A              L          Scotland          A&B            C
Hong Kong            A&B            C          Swaziland         A              C
Hungary              A&B            C          Sweden            A&B            C
Israel               A&B            C          Thailand          A&B            L
Japan                A&B            L          USA               A&B            L

* L = Longitudinal, C = Cross-sectional



The variations resulted in large part from the outcomes that national centres saw
as having most value for their systems at that time. For many, a comparison of
mathematics achievement between their system and other systems was the most
important outcome desired. Half of the systems were interested in obtaining an
indication of whether mean student performance in their system had improved or
declined since FIMS. The group of systems comparisons had as their prime interest
identification of variables that could be manipulated to improve mathematics
achievement.

SIMS was intended by the ISC to differ from FIMS in another important way. As
well as the introduction of the longitudinal study, there was a thorough analysis of the
curricula of participating systems. This was seen not only as having intrinsic interest,
but also as a way of illuminating the results of the cognitive tests.

Notwithstanding the variations built into the design, the "intended study" as far
as the ISC were concerned involved all systems completing all tasks and instruments in
the components of the study they had elected to participate in, following the detailed
instructions in memoranda and manuals to the letter, and hence producing flawless
data sets or, at worst, data sets with all deviations and omissions carefully documented.

The Implemented Study

Probably no national centre administered the study exactly as intended by the ISC.
Nor could it realistically be expected that they would. Each member institution operates
under its own constraints and within its own national culture.

Different National Committees develop different aspects of IEA studies as
priority areas and, if resources are scarce, may delete low priority questions or
instruments. In a few cases questionnaire items were changed, or mistranslated. In a
few cases some questionnaire responses were precoded (e.g. Language of the Home was
assumed to be Japanese for all students in Japan, and in a couple of systems period
lengths and days in the school year were assumed to be constant across schools).

Arriving at definitions of target populations and constructing sampling manuals
which can be implemented in an identical fashion and which have the same effects and
results in all systems is just not possible. But the sampling manuals used in SIMS were
based on the experience of past IEA studies combined with the wisdom of acknowledged
experts in sampling. A process which involved NRCs in comment, negotiation with
the SIMS Sampling Committee, and approval of sampling plans by a sampling referee
was designed to minimise sampling errors and to make outcome measures comparable
across systems. National samples which fell short of enabling these ideals to be fully
attained did so for a variety of reasons, as documented in the Sampling Report for the
study. But as I asserted in that report, even for the least satisfactory samples, enough is
known about them for some important conclusions to be drawn with reasonable
confidence. However, there are data for variables in some systems which should be
interpreted with great caution and are better not included in multi-variate analyses.

The data collection phase was generally well executed. Where response rates
were not as good as were hoped for, this was not through inadequacies in the manuals or
other advice sent from the International Centre in Wellington, nor was it through lack
of diligence on the part of NRCs. In the worst case the Nigerian NRC was unable to get
to several provinces because of extensive flooding, so the population definition for
Nigeria was changed to take account of this.

The least well executed part of the study at system level was the preparation of
data for shipping to the International Centre. With the wisdom of hindsight it is now
clear that in some centres this resulted from lack of experience in handling datasets of
the size of those in SIMS, i.e., several thousand cases with several lengthy records for
each case. Insufficient clerical provision had been made for checking and coding and
several NRCs experienced weeks of tedious work. There was also a considerable range
of expertise amongst NRCs in computer-related data preparation (but it should not be
assumed that poor tapes were received only from less experienced, or good tapes only
from more experienced, national centres. Experience helps, but the ability to follow
instructions was just as crucial). The outcome was that the data received at the
International Centre from national centres posed a series of challenges.

The first of these challenges was to read the tape - not always possible for the first
tape received. The second was to decipher what was on the tape. The third was to relate
what was on the tape to what was expected to be on the tape. The fourth was to caress
(or sometimes bash) the data set so as to get it into the required format and remove
out-of-range values and impossible outliers, without unnecessarily losing one "good" data
element.

Working through these stages was a long process. Where data could not be read
from the tape, the national centre had to be asked for a new tape. Even when tapes were
read, new tapes had to be asked for in some cases, but this was not common because
great efforts were made to get the data into shape at the International Centre. This was
judged to be likely to take less time than sending the faulty tape back to the national
centre and waiting for it to be dealt with, especially as national centres tended to have
used all their SIMS funding by that stage. When the data was readable and correctly
formatted it was checked for out-of-range data, outliers, accuracy of details of
modifications and deletions supplied by each national centre, and any unexplained
anomalies. The reasons for these were often obvious and appropriate editing could be
done at the International Centre. Other anomalies were able to be corrected at the
International Centre after considerable detective work. NRCs were sent frequency
outputs for each item in their data sets and asked to check them. They were asked to
check and approve changes made at the International Centre and to explain any
anomalies which the International Centre had been unable to resolve.
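
For readers preparing their own extracts from the databank, checks of this kind can be sketched in a few lines of Python. This is an illustration only, not the software or procedure used at the International Centre: the item names, the valid ranges and the missing-data code below are all hypothetical and would have to be taken from the actual codebooks.

```python
from collections import Counter

# Hypothetical codebook: item name -> (lowest valid code, highest valid code).
VALID_RANGES = {"item01": (1, 5), "item02": (1, 5)}
MISSING = 9  # hypothetical code for an omitted response


def out_of_range(records, valid_ranges=VALID_RANGES, missing=MISSING):
    """Return (case index, item, value) for every value outside its valid range."""
    problems = []
    for i, record in enumerate(records):
        for item, (low, high) in valid_ranges.items():
            value = record.get(item, missing)
            # The missing code is legitimate, so it is never flagged.
            if value != missing and not (low <= value <= high):
                problems.append((i, item, value))
    return problems


def frequencies(records, item, missing=MISSING):
    """Count how often each code occurs for one item (a simple frequency output)."""
    return Counter(record.get(item, missing) for record in records)


records = [
    {"item01": 3, "item02": 5},
    {"item01": 7, "item02": 9},  # 7 is out of range; 9 is the missing code
    {"item01": 1, "item02": 2},
]
print(out_of_range(records))
print(sorted(frequencies(records, "item01").items()))
```

Flagged values are listed rather than deleted, in keeping with the aim of not losing good data elements unnecessarily; what to do with each flagged value remains a judgment for the analyst.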

There could thus be several exchanges of correspondence between the
International Centre and a national centre (especially as some NRCs did not respond to
correspondence for some time). If the study were being conducted now, with the
availability of E-mail and Fax, this process would be drastically shortened. As it was, it
all took a long time. The alternative was, in my view, the loss of a great deal of data, and
possibly dropping some systems from the study.

The Attained Study

What do SIMS veterans have to show for their work up to now? Already there is
a substantial list of publications associated with the study. Two of the three volumes of
the international report are available and the third will be published in the near future.
Other substantial publications for an international audience are planned.

Articles have appeared in international and national journals, and short research
briefs have been prepared for national audiences. But perhaps the greatest impact of the
study has been made through the national reports produced by national centres. It is
these, written from the perspective of the participating system for a home audience, that
elicited responses from politicians, educational administrators, mathematics educators
and teachers who were made aware of a need to, or a way of, improving mathematics
education in their schools.

Not to be underestimated, either, are the long term effects on a system's
mathematics education and research communities of having participated in SIMS.
Researchers learned new techniques and refined old ones, mathematics educators were
introduced to new ways of looking at their field, and teachers in many of the systems in
the classroom processes component of the study remarked that participation had been
excellent in-service training in mathematics education.

But the most important outcome of SIMS could yet prove to be the resource
which will be under discussion during the next few days. Although there has been a
considerable amount published from SIMS data, the surface has scarcely been scratched.
There can be few databanks as extensive and as complex which have had the same
amount of careful work put into them to keep the data as complete as possible, to
provide extensive explanatory documentation, and to make the data accessible, as this
one.



Discussion 

This narrative does not amount to much more than a rather sketchy outline of
what was a decade's endeavour involving many people. It focuses on those features of
the study that led to its most noticeable outcomes. Another overview might have traced
the changes which took place in the intended outcomes and study procedures as more
systems committed themselves (late) to participation, or as understanding of the
standpoints of already participating systems grew. Changes in emphases within
mathematics education, and education generally, also gave rise to new emphases as
planning progressed. For example, early in the planning the use of calculators in
mathematics, applications in mathematics, and minimal competency policies were
projected as being major features of the proposed study. These topics eventually receded
into the background, but ERIC publications featuring discussions of the then current
views and activities in these areas from each of a wide range of countries were among
the important, but less visible, outcomes of the study.

The real guide to the degree of success achieved by SIMS will lie in answers to the
questions: Did the audiences which the ISC targeted "receive" useful messages from
SIMS? Has appropriate action resulted? Does the SIMS experience add to our
knowledge about research/mathematics education and suggest lines of inquiry for
future study?

For the first pair of questions the answer would be a qualified yes. We know that
SIMS has led to action designed to improve mathematics achievement from
policymakers, mathematics educators and teachers in some of the participating systems.
The qualification is because we do not have information from every system about the
impact of their national report, and because when action is taken it is usually not on the
scale that the research suggests is needed. Educational administrators, and often
teachers, tend to impart a "regression" effect towards the status quo.

The answer to the final question is an unqualified yes even at this relatively
early stage, and it is certain that knowledge about mathematics teaching and learning,
and about research into these, will increase if the databank is widely utilised. One would
hope that alternative models, both of learning and of analysis, will be tested. There is
scope for rethinking which are the key variables and how these variables might be
constructed. It would be of interest to replace achievement as the dependent variable by
Implemented Coverage, or Teacher Expectation of Success. Replacing mean class
achievement with percent of class reaching a given mastery level as dependent variable
might also lead to some interesting results. The possibilities are almost limitless.
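
The last of these suggestions is easy to illustrate. The sketch below, in Python, computes both summaries for each class from a list of student scores; the class labels, scores and mastery level are invented for the example and are not SIMS data, and the function name is hypothetical.

```python
from statistics import mean


def class_summaries(scores_by_class, mastery_level):
    """For each class, return (mean score, percent of students at or above mastery_level)."""
    summaries = {}
    for class_id, scores in scores_by_class.items():
        reached = sum(1 for score in scores if score >= mastery_level)
        summaries[class_id] = (mean(scores), 100.0 * reached / len(scores))
    return summaries


# Invented scores: both classes have the same mean but different spreads.
scores_by_class = {
    "class_a": [10, 10, 30, 30],
    "class_b": [20, 20, 20, 20],
}
summaries = class_summaries(scores_by_class, mastery_level=20)
for class_id, (class_mean, percent_mastery) in summaries.items():
    print(class_id, class_mean, percent_mastery)
```

In this invented case the two classes are indistinguishable on mean achievement, yet half of one class and all of the other reach the mastery level, which is the kind of difference the alternative dependent variable would bring out.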

Another field which might be explored via the SIMS data is that of educational
indicators. Many education systems throughout the world are seeking measures of the
"health" of their systems. A large OECD exercise is currently underway in this field.
Indices constructed from coverage (opportunity-to-learn) provide measures of
"conformity" (between what was expected of teachers and what they did in teaching
mathematics), and of "efficiency". Other IEA variables, yield for instance, suggest
themselves as indicators. Supplementing IEA data with up-to-date financial
information would give rise to a further set of indicators.

Conclusion 

Already there is talk of a Third International Mathematics Study (which
demonstrates again the healing power of time). A very few years ago the mere
suggestion of going through it all again would have brought on nightmares. But there
should be a TIMS.

Shortcomings in this sort of study are inevitable, but important difficulties
experienced with SIMS should now be able to be minimised. We know which
procedures work and which do not, which national centres need extra support and the
kinds of support they need. We know which variables worked, how to improve the
measurement of some of those which did not do so well, and which variables were
missing.

Improvements in communication (E-mail and Fax) would shorten a third study
by years, assuming reasonable levels of funding. Without funding adequate to provide
for cooperative planning and design of the study with NRCs, to execute the study, and to
provide for production of international reports, all bets would be off. I would not like to
see the general nature of IEA studies change, but if we cannot have executives working
full-time on the TIMS to ensure that it runs to schedule it would be better to abandon
the field to the "fast test" experts. (Reading the SIMS Bulletins will reveal that the study
schedule was a systematic variable with substantial variance. Arguments that, like fine
wine, the SIMS data would improve with age did not win approval.)

Perhaps IEA should update its fund winning methods. SIMS could just as easily
have stood for the Steinlager International Mathematics Study. New Zealand's results
could well have driven mathematics teachers to seek solace in that fine product.

In almost all substantial research projects it can be claimed that the data is grossly
under-exploited. Major efforts have been made to preserve the SIMS data in a form in
which it is readily accessible and interpretable to researchers for further analysis. All of
those who were involved in the SIMS enterprise will be delighted that the data will
continue to be used towards the improvement of mathematics teaching and learning.
Every effort must be made to see that researchers in many countries make full use of
this resource.

The Studies

Edward Kifer Richard Wolfe 

University of Kentucky Ontario Institute for Studies in Education 

Introduction

This chapter contains descriptions of a few major differences among the
countries that took part in the longitudinal portion of the Second IEA Mathematics
Study. The first section gives size, population and geographic information. The second
provides a brief summary of the structure of the school systems. Third, there is a brief
synopsis of the characteristics of the study conducted in each of the countries. Finally,
there is a general overview of the curricula of the countries. This and the second
chapter are meant to set a context in which the results can be interpreted and responded
to.

The Countries

This volume focusses on teachers, students and classrooms and how students
change during a year of schooling. The international nature of the study, however,
serves as a constant reminder that education is essentially a social and cultural
phenomenon. Students do change while in schools and some of that change is because
they are in schools. But all of what they learn is embedded in a context defined by
differences in values, geography, wealth, tradition and any of a variety of variables that
can be summed up rather easily. These are different countries.

There is a story¹ told by an Australian journalist about the Japanese sense of
"wah" (cooperation, harmony and balance) and how it pervades virtually every facet of
social and economic life. While living outside of Tokyo, he engages in typical activities,
one of which is buying gasoline for his automobile. The station he frequents charges a
few more yen per liter than one slightly farther away from his home. He is, of course,
free to change where he buys gasoline. Should he do that, however, the owner who
now has the journalist's business has an obligation (wah) to the previous owner to
compensate him for the loss of a customer. The amount and type of compensation is
determined through long and involved bargaining within a context of unwritten but
complicated rules. The journalist, on the other hand, has an interest in staying with the
first owner since it is also understood that should demand evaporate for his stories
about Japan, gasoline would be available and a tab kept for any reasonable amount of
time. After all, owning a gasoline station is much more (wah) than merely selling
gasoline to make money.

It would be possible, one imagines, to give anecdotes such as this for each of the
countries in the survey. Each maintains traditions and ways of operating that would be
"foreign" to others. Tacit understandings, language differences, customs, traditions and
other "cultural" variables are not measured in this survey. But there is no doubt that
they, just as some of the background conditions described below, influence profoundly
the experiences of children in schools.

Figure 1 shows, in terms of population and area², other differences between these
countries. The USA, for example, is in area 300 times larger than Belgium Flemish and
in population 100 times larger than British Columbia. The population densities of
Belgium Flemish and Japan are 100 times greater than that of British Columbia.
Though small, Belgium Flemish is very heavily populated; though relatively small,
New Zealand is rather sparsely populated. How and to what extent these factors
influence educational processes are matters for healthy speculation. To posit that such
factors do not influence directly or indirectly schools and schooling would be folly.

Figure 2 contains demographic and educational factors³ that differ between these
countries. What is estimated to be the wealthiest country, British Columbia, has a per
capita income 200 times larger than that of Thailand, the poorest country. One hundred
percent of the students in Japan responded that the language of the school and their
home is always the same while only 16% of students in Belgium Flemish gave that
response. Students in the United States are exposed to 50% more mathematics
instruction in a school year than are students in Japan. Still, the estimated 150 hours per
year for the USA represents at best 15% of the time that students spend in school and, if
one imagined an intensive mathematics course that lasted 6 hours per day seven days
per week, it would be only about 3 1/2 weeks of the year that they were in a setting
where mathematics is taught. It is a small part of a child's life that is devoted to
receiving mathematics instruction.
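
The arithmetic behind these figures can be checked with a short sketch. The 150 hours,
the 15% share and the 6-hour, seven-day schedule come from the text; the variable names
are illustrative only, and the yearly total of school hours is merely what the 15% figure
implies, not a value reported by the study.

```python
# Figures quoted in the text
math_hours_per_year = 150      # estimated U.S. mathematics instruction per year
share_of_school_time = 0.15    # "at best 15%" of the time spent in school

# Hypothetical intensive course from the text: 6 hours a day, 7 days a week
hours_per_intensive_week = 6 * 7

weeks_of_math = math_hours_per_year / hours_per_intensive_week
# Implied total, derived from the 15% figure (an inference, not a reported value)
implied_school_hours = math_hours_per_year / share_of_school_time

print(f"Intensive weeks of mathematics: {weeks_of_math:.1f}")
print(f"Implied school hours per year:  {implied_school_hours:.0f}")
```

The result, a little over three and a half weeks, agrees with the "about 3 1/2 weeks"
quoted above.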

Enrollment figures for the eight countries show equally dramatic differences.
Since 1965, about the time of the first IEA mathematics study, the seven developed
countries have had rather stable and sometimes declining total enrollments at the
primary levels of schooling. Thailand during that period increased its primary school
enrollment by almost 3 million students or an increase of almost 75%.⁴ Its lower
secondary schools increased six-fold, from a 1965 total of about 250,000 to a 1980 total of
1,500,000. The developed countries found another way to expand schooling: make more
of it almost mandatory. France in those 15 years changed from a university enrollment
of 400,000 to 1,000,000. Japan almost tripled the number of students in higher education
from a 1965 total of 800,000. Both community college and university enrollments
expanded rapidly in British Columbia and Ontario. So there has been expansion of
schooling in all countries. The major difference is at what level of a school system the
enrollments grew.

Despite those rather obvious differences, the countries apparently share similar
views of the power and importance of schooling. Children begin to go to school in each
country at either age 5 or 6 and end no earlier than age 15 or 16. A trend appears,
however, to be in the direction of extending both the duration and universality of
expected time in schools. A 1974 reform act in Belgium Flemish made at least half-time
attendance compulsory until age 18 while students in Ontario can attend publicly
supported schools at age 4. Formal schooling (from cradle to grave?) is expanding, with
the developed countries increasing participation in schooling beyond the secondary
level and the developing ones making primary and secondary education universal.

The Structure of the Schools

Those who are familiar with U.S. schools know how difficult it is to describe how
they are organized. The sample, for instance, of students in this study comes from the
eighth grade. Is that the end of elementary school, the second year of junior high school
or the end of a middle school that separates elementary and secondary schooling? Other
countries have equally ambiguous organizational structures so what follows is a
description of typical patterns rather than a presentation of these school organizations in
all of their complexity.

Belgium Flemish 

Pre-schools are available to children ages 2 1/2 to 6. Primary education is
compulsory from ages six to twelve after which students enroll in lower secondary
school. There are two types of curriculum offered in these schools, one called common
general and the other vocational. An upper secondary school is available from ages 15
to 18 with half-time attendance required from ages 16 to 18. There are several different
types of organizational authority for schools. They include private, usually Catholic,
schools, provincial schools, state schools and communal schools. The sample of
students upon which the analyses for this volume are based comes from the lower
secondary school.

British Columbia

For both Canadian provinces the structure, financing and control of the schools
are independent of the national government. Children in British Columbia have
opportunities to attend pre-schools and kindergartens prior to age 6 at which time there
is compulsory enrollment in a 6 year primary school. For ages 12 to 18 there are lower
secondary and upper secondary schools. Differentiation of curriculum occurs during the
upper secondary schools and all children are exposed to common activities prior to then.
The sample of students comes from grade 8 which is located in the lower secondary
school.

France 

Schooling in France is considered to be highly centralized. Children of ages 2 to 5
may attend pre-primary schools and 1980 estimates are that about 90% of them do.
Primary school extends from ages 6 to 10 and grade repetition, though declining
somewhat, remains relatively common compared with other countries. The first cycle
of secondary schooling is for ages 11 to 14 and contains a common curriculum except
that after two years students may pursue more vocationally oriented courses. The
second cycle of secondary schooling leads either to a baccalaureate degree and
preparation for university or a vocational certificate. About 25% of the Population A
students attend private schools. Students from the first cycle are in this sample.

Japan

Japanese children attend pre-schools and kindergartens from ages 3 to 6,
elementary school from 6 to 12, lower secondary from 12 to 15 and upper secondary from
15 to 18. Upper secondary schools provide a variety of alternatives including vocational,
university preparation and correspondence courses. Examinations after lower secondary
school determine what upper secondary school a child attends. The sample of children
in this study comes from the 1st year of lower secondary school and is comparable to
grade 7 in the United States.

New Zealand 

Children may attend pre-school or local play centers from ages 3 to 5. Primary
schooling begins on the child's 6th birthday and continues for 8 years. There are up to 5
years of secondary schooling available but students may leave earlier to pursue
vocations. About 10% of the schools are private; about 30% of the schools are either for
boys or for girls. Students from Form 3, a kind of intermediate level between primary and
secondary schools, are the sampled population.

Ontario 

Children may attend publicly supported schools as early as age 4. Primary
schooling of about 6 years is followed by 6 years of secondary schooling which, in
addition, contains uniquely a capstone grade 13. Ontario has both private schools and
schools where the language of instruction is French. The sampled population came
from grade 8, a part of the secondary school.

Thailand 

Local communities provide what pre-schooling is available for students prior to
the age of 6. Primary schooling extends from age 6 to age 12; lower secondary from age
12 to age 15; and upper secondary from ages 15 to 18. Schools are financed by the
national government and as indicated earlier there has been a dramatic increase in
enrollments at all levels and especially at the secondary level. The sample of students
comes from the lower secondary schools.

United States 

A variety of pre-schooling opportunities, both private and public, are available for
children under the age of 6. In most states publicly funded kindergartens are available at
age 5. From age 6 to about age 14 students attend elementary schools but the particular
structure depends on the local school system. Secondary education, through
comprehensive schools that provide either college preparatory or vocational curricula, is
available for students of age 14 to 18. The sample of students for the study comes from
the eighth grade. About 10% of the students are enrolled in private schools.

Differences in school organization may simply represent various ways to slice the
same loaf of bread. That is, there may be few important consequences of having an
eighth grade in a junior high versus an elementary or middle school. Yet, it is
interesting to note that in the majority of countries the sample of students comes from
what is perceived to be secondary schooling as opposed to the United States where
students perceptually have not yet entered secondary schools.

Control of Schools

As ambiguous as school organization but more important may be the issue of
control of schooling. Here, at least superficially, differences are only between the United
States and the others. In the U.S. there remains, at the rhetorical level, the notion that
control of schools resides in the local community. While this may be correct
historically, there is little doubt that recent educational reforms have had an intended or
unintended effect of diminishing local autonomy and placing more control at the state
level. This change in locus comes on top of earlier efforts of the Federal Government to
institute programs where in effect schools had to play by federal rules in order to qualify
for federal monies. So a question for the U.S. is whether or not there is local control of
schools.

Other countries presumably have central control of schools. For the two
Canadian provinces central control in this case means provincial control. For the remainder of
the countries it means that educational policies are made at a national level. Just as one
can question the validity of the notion of local control for the U.S. so too it is possible to
wonder what central control means for the other countries. Questions here revolve
around the notion of what really can be controlled. For example, no system has the
power to control completely what teachers do in classrooms, but it can to a greater extent
control who becomes a teacher. Likewise, it is possible to define at a national level the
nature of a curriculum, but it is impossible to ensure that it is implemented consistently
across classrooms. As will be seen later in this volume, teachers within all countries
vary greatly in regard to what part of the curriculum they say they teach. In addition,
most countries in the survey make provisions for local or provincial initiatives on
virtually all matters related to schools. So, what one has generally is tension among
various administrative levels regardless of where formal or legal control resides.

Issues of types of control over schools and how much control is possible, though
complex, are crucial. Within those realms are potentially important explanatory
variables. Countries use inspectors, competence examinations, school leaving
examinations, financial threats, legal authority, teachers' unions and a multitude of
other means to influence outcomes of schooling. Yet, prudence dictates that with
limited evidence one should mention differences but not attempt to resolve them.
Hence, for this volume they remain important but unresolved issues and the
relationships between them and outcomes of schooling are unexamined.

Characteristics of the Studies

Countries which decide to participate in IEA studies decide what parts of the
larger study they will implement. For the mathematics study additional choices were
allowed in terms of how the agreed upon cognitive items would be administered. In
addition, for funding and other administrative reasons some countries completed their
study a year earlier than others. For these and numerous other reasons there are
variations in the studies conducted by these eight countries.

The Samples 

The formal definition of the students in Population A was: All students in the
grade (year level) where the majority has attained the age of 13.00 to 13.11 years by the
middle of the school year. As indicated earlier, this population fell in different levels of
the school system depending on the structure of the schools within a country. The
comparability of the samples, therefore, resides in the age of those who were sampled.

A second kind of sampling, item sampling, was conducted in the studies.
Essentially, a sampled student within a country was administered a Core test of 40 items
and one of four rotated forms of 35 items. So although any one student might take no
more than 75 items, responses within a country would be obtained from a full set of 180
items. The pattern of items within the core and rotated forms as well as the decision of
which items to administer during the pretest were left to the countries.
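
The item counts in this design can be verified with a minimal sketch. The numbers (a
40-item Core, four rotated forms, 35 items per form) are taken from the text. The form
labels A to D and the round-robin assignment rule are assumptions made purely for
illustration; as noted above, each country decided its own pattern.

```python
CORE_ITEMS = 40               # Core test taken by every sampled student
ROTATED_FORMS = 4             # number of rotated forms
ITEMS_PER_ROTATED_FORM = 35   # items on each rotated form

# Each student takes the Core plus exactly one rotated form.
items_per_student = CORE_ITEMS + ITEMS_PER_ROTATED_FORM

# Across all students, a country obtains responses to the Core and all four forms.
total_items = CORE_ITEMS + ROTATED_FORMS * ITEMS_PER_ROTATED_FORM


def assign_form(student_index: int) -> str:
    """Assign a rotated form by cycling A, B, C, D (hypothetical rule)."""
    return "ABCD"[student_index % ROTATED_FORMS]


print(items_per_student, total_items)
print([assign_form(i) for i in range(6)])
```

Under such a rotation roughly a quarter of the sampled students answer any given
rotated item, while every student answers every Core item.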

Both kinds of information, that of the sampling of students and the organization
of the cognitive test, are given below for each of the countries.

Belgium Flemish

1. The sample: All students in the second year of the general secondary
education, technical secondary education and vocational secondary education programs
in both Type I and Type II forms of school organization. Less than 1% of the student
cohort was excluded by this definition. The sample was composed of 168 schools, 175
classrooms and 4519 students.

2. Cognitive test: The longitudinal core was adjusted to the Belgium Flemish
curriculum. Both Core and rotated forms were administered at the pretest and the
posttest with complete rotation between the two occasions. Some linkage between
student scores at pre and posttest times has been lost.

British Columbia 

1. The sample: All students enrolled in regular grade 8 classes in September
1980 in the British Columbia public school system. Both slower students in remedial
classes and students attending private schools, about 10% of the cohort, were excluded
from the sample. The sample was composed of 90 schools, 93 classrooms and 2567
students.

2. Cognitive test: A standard (i.e., the same as 5 of the other countries)
longitudinal core administered both at pretest and posttest. The rotated forms were
given only at the posttest.

Ontario

1. The sample: Students enrolled in normal grade 8 classrooms in Ontario. The
excluded population was less than 2%. The sample included 130 schools, 197 classrooms
(two classrooms per school where possible) and 6284 students.

2. Cognitive test: The standard longitudinal core and rotated forms tests were
administered both at pre and posttest. There was complete rotation of the forms
between pre and posttest.

France 

1. The sample: All students in class de 4e (grade 8) of colleges, private and
public education in metropolitan France. The excluded population is estimated to be
less than 1%. The sample was composed of 184 schools, 365 classrooms (2 per school)
and 8778 students.

2. Cognitive test: The longitudinal core is adjusted to the French curriculum. It
was administered along with the rotated forms both at pre and posttest. Students took
the same rotated form on both occasions.

Japan

1. The sample: Students in grade 1 of lower secondary school (U.S. grade 7
equivalent). Excluded were students in private schools or schools for the handicapped.
About 3% of the cohort attends private schools and about 1% schools for the
handicapped. The sample was 210 schools, 211 classrooms and 7785 students.

2. Cognitive test: A distinct item set. There was a special 60-item test at the
pretest and then Core and rotated forms at the posttest.

New Zealand 

1. The sample: All students who are in normal classes in Form 3. The excluded
population was less than 1%. The sample was of 100 schools, 196 classrooms (2 per
school) and 5978 students.

2. Cognitive test: The standard longitudinal Core and rotated forms were
administered both at pre and posttest with complete rotation between the two occasions.

Thailand 

1. The sample: All students in normal classes in grade 8 in all 71 provinces.
There was no excluded population although only 85% of the cohort attends school at
this level. The sample was 99 schools, 99 classrooms and 4030 students.

2. Cognitive Test: The standard longitudinal Core and rotated forms were
administered both at pre and posttest. Students received different rotated forms on the
two occasions with no repetition of forms.

United States

1. The sample: All students in the eighth grade of mainstream public and
non-public schools. Excluded were students with disabilities sufficiently severe to require
special education classes. The sample was composed of 161 schools, 302 classrooms (2
per school) and 8372 students.

2. Cognitive Test: The standard longitudinal core and rotated forms were
administered both at pre and posttests with complete rotation between the two
occasions.

While there are major differences between countries in terms of sampling and
types of testing, other variables also may be excluded from one or more countries.
Student background questionnaires, teacher questionnaires, students' perceptions of the
fit of the test, use of calculators and other measures are, for the most part, present across
the countries but some contain variations. Where that occurs it will be documented in
the text.
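
For readers who want the sample sizes side by side, the sketch below collects the counts
of schools, classrooms and students reported above for each system and derives the
average number of sampled students per classroom. The counts are those given in the
text; the dictionary layout and the derived averages are illustrative additions and are
not figures reported by the study.

```python
# (schools, classrooms, students) as reported for each system in the text
samples = {
    "Belgium Flemish":  (168, 175, 4519),
    "British Columbia": (90, 93, 2567),
    "Ontario":          (130, 197, 6284),
    "France":           (184, 365, 8778),
    "Japan":            (210, 211, 7785),
    "New Zealand":      (100, 196, 5978),
    "Thailand":         (99, 99, 4030),
    "United States":    (161, 302, 8372),
}

print(f"{'System':<18}{'Schools':>8}{'Classes':>9}{'Students':>10}{'Per class':>11}")
for system, (schools, classrooms, students) in samples.items():
    per_class = students / classrooms  # derived here, not reported in the source
    print(f"{system:<18}{schools:>8}{classrooms:>9}{students:>10}{per_class:>11.1f}")

total_classrooms = sum(c for _, c, _ in samples.values())
total_students = sum(s for _, _, s in samples.values())
print(f"{'All systems':<18}{'':>8}{total_classrooms:>9}{total_students:>10}")
```

The ratio of classrooms to schools also makes the sampling design visible: systems that
sampled two classrooms per school show roughly twice as many classrooms as schools.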

The Curriculum 

An IEA volume has been devoted to an investigation of the curriculum of all
countries in the mathematics survey. Here a couple of instances will be used to
highlight some major differences between countries. For this purpose, there are two big
questions that one can ask about a country's curriculum: What is in it and which
students get it?

As a partial answer to what is in the curriculum, national committees in the
various countries were asked to rate items on the achievement test to determine
whether or not they were appropriate. Those ratings provide a way to show how varied
the curricula are in these eight countries. Table 3 contains the text for eight selected
items and countries' responses to those items.

Patterns of responses across items and countries suggest that there are very
different curricula despite the fact that the study deals with mathematics, a content area
where it is assumed that there is so much in common. The square root item, 011, is
inappropriate in both Japan and New Zealand but at least acceptable in the other
countries. The two geometry items, 022 and 096, form an interesting contrast since they
tend to be linked, either both acceptable or not acceptable, in Belgium Flemish, Ontario
and Thailand. For the other five countries the curricula apparently include materials
related to one of the items but not the other. Item 026 could be considered either a
geometry item (similar triangles) or a ratio and proportion item. Yet, it is not acceptable
in either Belgium Flemish or France but fine elsewhere. The reasoning item, 114, is
taught in Japan and New Zealand as well as in Belgium Flemish and France but not in
the other four countries. The item which is most generally acceptable is a probability
item. Interestingly enough, probability and statistics as a content area is the least
represented in both the curricula of the countries and the cognitive test. Apparently,
even though countries do not emphasize statistics, they do agree on what little of it
should be taught.

Selected Items

Item  Text
011   What is the square root of 12 × 75
022   AB, CD, AD, EF are intersecting straight lines as shown
026   On level ground, a boy 5 units tall casts a shadow 3 units
044   There are 35 students in a class. 1/5 of them come to school...
055   For the table shown, a formula that could relate M and N is
096   The triangle ABC and triangle A'B'C' are congruent and their...
114   The first error, if any, in this reasoning occurs in...
188   The picture shows some black and some white marbles. Of all...

Rating

Item  BFL  BC   FRA  JPN  NZ   ONT  THA  USA
011   ?    ?    ?    ?    ?    ?    ?    ?
022   ?    ?    ?    ?    ?    ?    ?    ?
026   ?    ?    ?    ?    ?    ?    ?    ?
044   ?    ?    ?    ?    ?    ?    ?    ?
055   ?    ?    ?    ?    ?    ?    ?    ?
096   0    1    2    0    1    2    0    0
114   2    0    1    1    1    0    0    0
188   2    2    2    2    2    2    2    2

2 = Highly Appropriate   1 = Acceptable   0 = Inappropriate   ? = not legible

Table 3. Individual countries' appropriateness ratings for selected items on the
cognitive test

The answer to the second question, which students are exposed to what
curriculum, is straightforward on the surface of things. For the majority of the
countries, there is only one "official" curriculum for all students so the answer is that
almost everyone gets the same things. Officially, there are exceptions in Belgium
Flemish and the United States. Unofficially, as will be seen later in the volume, there
may be other exceptions. For Belgium Flemish there are two types of mathematics
classes, one which is taught in the general curriculum and one that is taught in the
vocational curriculum. Often these are in separate schools. In the United States, there is
no official national curriculum, usually no official state curriculum but almost always
different types of courses for different students within local school districts. Generally
the courses are: 1) Remedial, 2) General, 3) Pre-algebra and 4) Algebra. Students typically
are tracked into the various courses according to perceptions of prior achievement. This
differentiation of the curriculum in the United States is the basis for a chapter later in
the volume.

The International Curriculum Analysis volume gives detailed descriptions of the
curricula of each of these countries. The content of the curriculum, how it is delivered
and who gets it so influence what students have an opportunity to learn and do learn
that it would be difficult to overestimate how important they are as explanatory
variables for differences in achievements.

Conclusion 

This paper is meant to provide general information about the countries. The 
aim is to remind the reader that there are major differences between the countries in 
very important ways. The analysis and interpretation of survey data is, therefore, a 
matter of taking things out of broader and richer contexts. With survey methodology 
there is no alternative to such a strategy. There is merely the necessity of reminding 
ourselves from whence the data came. 

Footnotes

¹Sayle, Murray. New York Review of Books. April 23, 1985.
²Encyclopedia Britannica. 15th Edition. Chicago, Illinois.
³The International Mathematics Curriculum. Second IEA Mathematics Study.
⁴Husen, T. & Neville Postlethwaite. International Encyclopedia of Education.

Historically, the cross-national studies conducted by the International Association
for the Evaluation of Educational Achievement (IEA) have generated data bases that
have fostered considerable interest over and beyond their use in the primary study
reports. The broad range of questions that have been included in IEA survey and test
instruments, the large, multilevel school-based samples, and the mixture of
participating systems attract researchers and policy analysts with a wide array of interests.
The various compendiums of IEA-linked bibliographies are testament to the fact that
virtually any relevant topic which did not appear in the original study reports becomes
the subject of secondary investigations by someone somewhere.

While unique among IEA studies in many respects, the Second International
Mathematics Study (SIMS) is clearly attracting the secondary analysis interest of
traditional IEA enthusiasts. Moreover, there have been inroads to new constituencies:
mathematics educators, analysts interested in indicator development, state educational
officials, etc. If nothing else, this conference is a clear testament to the breadth of
continuing interest in SIMS and SIMS-related research.

My role goes beyond extolling SIMS virtues and characterizing the world of
secondary analysis according to SIMS. My intent is to describe SIMS as a data resource for
doctoral dissertations. At institutions where members of the SIMS curriculum and
technical panels reside, there have already been a number of doctoral theses completed
and others are in progress. The topics represented span a range of both methodological
and substantive issues and naturally gravitate around the interests of the sponsoring
professors. The largest concentrations thus far come from the Mathematics Education
program at the University of Illinois at Urbana-Champaign and from the Research
Methodology program at UCLA where both Bengt Muthén's and my students use SIMS
to try out the latest methodological developments within a substantively rich
educational database or to explore within SIMS substantive questions that have
historically been investigated in the educational (school, teacher, classroom, instruction,
curriculum) effects literature.

Before turning to a description of the focus and results of the doctoral
dissertations, I want to condition my comments by providing some perspective on my
view of their nature and purposes. Empirically grounded doctoral dissertations tend to
be "constrained" investigations. Under the best of circumstances, they arise through an
evolving commitment of the student to a sustained, focused line of inquiry that serves
as a foundation and source of ideas (and publications) for the first phase of the
post-degree career. Often, despite explicit and implicit time and resource limits, the
dissertation makes an original contribution by clarifying, elaborating, or extending
current thinking, or on rare occasion, by challenging conventional wisdom. The benefits
to the field are both primary (knowledge production) and secondary (i.e., development
of a potentially productive new professional).

Other purposes are served as well. Quite often, the research enables the
dissertation sponsor to "extend" a line of inquiry they have started and contributes to
the total mosaic of the senior scholar's research domain. In most of these cases, the
general idea for the dissertation topic originates from the sponsor with the student both
refining the idea to reflect his or her own notions and executing the investigation under
mutually agreed upon guidelines. Hopefully, the student develops enough investment
in the substance of the dissertation to make it his or her own. Otherwise, the dissertation
serves mainly as an exercise or demonstration and thus primarily a rite of passage rather
than the substantive foundation for a career.

One other general feature of the SIMS dissertations is worth noting. By necessity,
these dissertations are all secondary data analysis projects. As such, the empirical
investigation itself is constrained by the available quantity and quality of data. And,
even in a data gathering activity as massive as SIMS, certain measures weren't
included and study samples were oriented in certain ways. For example, virtually all
variables in SIMS were mathematics related; even student background measures of
home support and resources were linked to mathematics rather than to general
encouragement and support for education. Also, even though the SIMS battery of test
items was considerably larger than in previous IEA studies, for certain types of studies,
item sampling from certain topics is rather sparse. Finally, as with most other IEA
studies, the SIMS data are all of the survey self-report type. While a student using a
given set of SIMS questions and items can be expected to examine their measurement
properties, they typically have little recourse if specific measures didn't work as
intended.¹

Overview of the Dissertations 

A list of all SIMS dissertations completed or in progress as of January 1989 is
contained in Table 1. The names of the dissertation advisor(s) are included in addition
to the name of the author, year of completion, title, and institution. Where available,
abstracts from the dissertations are appended.

Topically, the dissertations break down into several that are primarily
methodological (Delandshere, Kao, Lehman, Ryan) and the remainder, which focus on
substantive aspects of mathematics teaching and learning and its measurement.
Virtually every dissertation thus far has examined the achievement data in some
fashion and included OTL or content coverage information. Several of the dissertations
took advantage of special or unique features available in the longitudinal version of
SIMS, such as the pretest data (Chang, Charles, Delandshere, Dhompongsa, Fagnano,
Garnier, Hafner, Kanjanawassee, Kao) and the detailed classroom process questionnaires
(Chang, Charles, Dhompongsa, Fagnano, Garnier, Hafner, Kao, Williams). Most of the
dissertations used data from a single country (usually the U.S., although Thailand's data
have been analyzed by three different students). Obviously, plenty of opportunity
remains to take full advantage of the cross-national aspects of the data.



¹ In many instances, there were built-in redundancies in measuring certain aspects of classroom and
curriculum practices that help matters somewhat. However, much of what was tried with the
classroom process instruments was quite novel and thus experimental, especially for such a large
study. This put students in the position of having to carry out their own validity investigations
with very little literature to guide them.

Table 1. List of SIMS-related dissertations completed or in progress.

University of British Columbia

Michael K. Dirks (1986). Opportunity to learn in grade 8 schools in British Columbia.
Unpublished doctoral dissertation. University of British Columbia (David F.
Robitaille, Advisor)

University of California, Los Angeles

Ginette Delandshere (1986). Structural equation modeling applied to multi-level data:
The effect of teaching practices on eighth grade mathematics achievement.
Unpublished doctoral dissertation. University of California, Los Angeles (Leigh
Burstein, Advisor)

Cheryl L. Fagnano (1988). An investigation into the effects of teachers' subject matter
and subject specific pedagogy training on the mathematics achievement of
eighth-grade mathematics students. Unpublished doctoral dissertation.
University of California, Los Angeles (Lewis H. Solmon and Leigh Burstein,
Advisors)

Helen E. Garnier (1988). Curriculum comparisons: Examination of eighth-grade
mathematics instruction data from the Second International Mathematics Study
in the United States. Unpublished doctoral dissertation. University of California,
Los Angeles (Marvin C. Alkin and Leigh Burstein, Advisors)

Anne L. Hafner (in progress). The use of teaching method scales in exploring the
relationship between mathematics teaching styles and differential class
achievement. Dissertation in progress. University of California, Los Angeles
(Leigh Burstein and Richard J. Shavelson, Advisors)

Sirichai Kanjanawassee (1989). Alternative strategies for policy analysis: An assessment
of school effects on students' cognitive and affective outcomes in lower
secondary schools in Thailand. Unpublished doctoral dissertation.
University of California, Los Angeles (Marvin C. Alkin and Leigh Burstein, Advisors)

Chi-Fen Kao (in progress). An investigation of instructional sensitivity in mathematics
achievement test items for U.S. eighth grade students. Dissertation in progress.
University of California, Los Angeles (Bengt O. Muthén, Advisor)

James D. Lehman (1986). Opportunity to learn and differential item functioning.
Unpublished doctoral dissertation. University of California, Los Angeles (Leigh
Burstein and Bengt O. Muthén, Advisors)

University of Illinois, Urbana-Champaign

Chang, Li-Chu (1984). The effects of teacher and student perceptions of opportunity to
learn on achievement in beginning algebra in five countries. Unpublished
doctoral dissertation. University of Illinois at Urbana-Champaign (James
Hirstein, Advisor)

Charles, Josephine (1985). Teaching mathematics in lower secondary schools in
Swaziland. Unpublished doctoral dissertation. University of Illinois at Urbana-
Champaign (James Hirstein, Advisor)

Dhompongsa, Gullayah (1985). The teaching and learning of mathematics in eighth
grade classes in Thailand. Unpublished doctoral dissertation. University of
Illinois at Urbana-Champaign (Kenneth J. Travers, Advisor)

Katherine E. Ryan (1987). A conceptual framework for investigating test item
performance with the Mantel-Haenszel procedure. Unpublished doctoral
dissertation. University of Illinois at Urbana-Champaign (Robert L. Linn,
Advisor)

Staples, Peter M. (in progress). A study of changes in secondary school mathematics
amongst nine countries between 1963 and 1983. Dissertation in progress.
University of Illinois at Urbana-Champaign (Kenneth J. Travers, Advisor)

Wattanawaha, Nongsuch (1987). A study of equity in mathematics teaching and learning
in lower secondary schools in Thailand. Unpublished doctoral dissertation.
University of Illinois at Urbana-Champaign (Kenneth J. Travers, Advisor)

John B. Williams (1988). The teaching of calculus in high schools in the United States.
Unpublished doctoral dissertation. University of Illinois at Urbana-Champaign
(Kenneth J. Travers, Advisor)



Findings From Dissertations

What kinds of conclusions can be drawn from the results of the dissertations
done thus far? Basically, my reading of the picture is that the substantive results tend
to reinforce and elaborate points hinted at in the main SIMS volumes and in The
Underachieving Curriculum, which is helpful. There are also consistencies with the
prevailing notions in the educational effects literature on classrooms and schools and
on the power of curricular opportunities (both exposure and emphasis) as a component
in mathematics achievement. Examples of results along the above lines include:

1. Centrality of Content - Whether measured in terms of OTL (reported by
teachers or by students), time allocations, or content emphases, the effects of the content
actually covered on mathematics achievement are potent (e.g., Chang, Delandshere,
Garnier, Kao, Lehman). Only prior performance (represented by the pretest) has a
consistently stronger relationship to achievement (as represented by the posttest) than
the content coverage measures. Even after controlling for prior performance (and thus
its effects on content coverage), content coverage remains influential.

2. Influence of Textbooks - Different types of analyses applied to data from three
systems (British Columbia, Swaziland, U.S.) highlight the role played by textbooks in
determining teachers' content decisions and instructional strategies, and their
consequences for students (Charles, Dirks, Garnier). According to Garnier, four
frequently used American textbooks at Grade 8 differed in terms of content coverage and
presentation. Students in regular classrooms using one of these texts had higher
achievement scores across all content areas and performed considerably better in
geometry and on items tapping comprehension and application skills. (Of course, the
teachers using this text tended to be older, better educated, and more experienced. These
same teachers, more than other teachers, tended to emphasize problem solving skills
and developing an attitude toward inquiry; they also provided more opportunity to
learn and emphasized more teaching methods in all mathematics topics.)

3. Weak Effects of Teacher Training - For the U.S. sample, at least, teacher
training as typically measured by number of courses in general education, mathematics,
and mathematics pedagogy has little if any impact on student learning (Fagnano).
Teacher training exhibited similar associations with both pretest and posttest
performance, making it difficult to disentangle the unique influence of training on
student learning. There were indications that the prevalence of certain classroom
processes and teacher subject matter beliefs were influenced by training in mathematics
and mathematics pedagogy, but there were inconsistent results regarding the indirect
effects of training (through its effects on processes and beliefs) on achievement.

4. Pervasive Influence of Prior Performance - Regardless of the focus of the 
dissertation, the consequences of including the pretest to measure prior knowledge and 
mathematics ability were considerable. The pretest is the strongest predictor of posttest 
everywhere and in all content areas. It is also typically more strongly associated with 
most student background variables than the posttest. Consequently, controlling for prior 
knowledge in analyses of achievement typically eliminates the influence of most 
student background variables. Prior knowledge as measured by the pretest is also 
associated with teacher attributes, curricular opportunities, and instructional practices 
and processes. As a result, the effects of the latter types of variables on posttest are often 
dampened rather than heightened by controlling for prior performance. Taken as a group, the
results from the various dissertations clearly highlight the delicate task of exploring the 
distinction between knowing and learning (or, alternatively, status and growth) and the 
effects of student, teacher, class, and school characteristics on either or both. 
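
The dampening described in point 4 can be illustrated with a small simulation. What follows is a minimal sketch on synthetic data, not a reanalysis of SIMS: the variable names, the generating coefficients (0.6, 0.8, 0.2), the noise levels, and the sample size are all assumptions chosen only to mimic a situation in which content coverage is itself associated with the pretest.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def simple_slope(x, y):
    """Slope of y on x alone (no control for prior performance)."""
    return cov(x, y) / cov(x, x)


def partial_slope(x, z, y):
    """Slope of y on x with z held fixed (two-predictor least squares)."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    return (sxy * szz - szy * sxz) / (sxx * szz - sxz ** 2)


def simulate(n=5000, seed=0):
    """Synthetic students: coverage depends on pretest (assumed, illustrative)."""
    rng = random.Random(seed)
    pretest = [rng.gauss(0, 1) for _ in range(n)]
    coverage = [0.6 * p + rng.gauss(0, 0.8) for p in pretest]
    posttest = [0.8 * p + 0.2 * c + rng.gauss(0, 0.5)
                for p, c in zip(pretest, coverage)]
    return pretest, coverage, posttest


if __name__ == "__main__":
    pretest, coverage, posttest = simulate()
    print("coverage effect, posttest only:     ",
          round(simple_slope(coverage, posttest), 2))
    print("coverage effect, pretest controlled:",
          round(partial_slope(coverage, pretest, posttest), 2))
```

Under these assumptions the unadjusted coefficient absorbs part of the pretest's influence, so it is several times larger than the adjusted one. Which of the two is the "right" effect is exactly the status-versus-growth question raised above.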

On the methodological side of the ledger, not surprisingly, we learn that it does 
matter how you measure achievement and instructional experiences, and how the 
hierarchical, multilevel structure of the data is taken into consideration in analyses. As
examples: 

1. Specificity of Outcome Measures - While some dissertations used total scores
across all test items (or all items on the core) as outcomes, when subtests defined by
content area or some other feature of test items are used, the patterns of relationships to
other measures (curricular opportunities, instructional practices and processes) tend to
vary. Individual test items within narrowly defined content categories were
differentially sensitive to student and instructional characteristics in some cases
(Lehman, Kao, Hafner). Clearly, the specificity of the outcome measure mattered (as did
the use of posttest only, gain, or adjusted gain).

2. Specification of Opportunity to Learn - Once someone decided to use OTL in
their investigation, the choice of which measure to use remained. Both teachers (for all
items) and students (for core items at both pretest and posttest) were asked whether the
content necessary to answer individual items was taught or reviewed during the year.
Moreover, teachers were also asked to indicate whether the content was taught in prior
years if not taught during the year. These different measures capture overlapping but
nonisomorphic features of perceived content coverage. Both their interrelationships
and their relationships to other variables accentuate certain aspects of the instructional
opportunities experienced by students. As such, their commonalities and distinctions
influenced the conclusions reached in various dissertations (e.g., Chang, Fagnano,
Lehman, Kao); a different choice would likely have resulted in different conclusions.

3. Choice of Relational Analysis Strategy - Some dissertations conducted all
relational analyses at the student level while others conducted all analyses at the class or
school level. Yet others did both, or employed several variants of multilevel analysis. In
studies where different analytical methods for handling the multilevel structure of the
data were contrasted, the substantive interpretations suggested by different analytical
strategies changed (e.g., Delandshere, Fagnano, Kanjanawassee). Typically, conducting
analyses solely at the student level yielded a greater number of purportedly significant
effects of class and school variables, although significance levels were usually inflated in
such analyses. Conducting analyses solely at the macro (class, school) level tended to
mask within-school relationships of student background and prior performance
measures to achievement and also interactions between student characteristics and
instructional characteristics in accounting for achievement outcomes.
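
The inflation of significance levels noted in point 3 can be made concrete with a simulation. The sketch below is illustrative only and uses synthetic data: the number of classes, the class size, and the variance components are assumptions, and the class-level predictor is generated with no true effect on achievement. It fits the same one-predictor regression twice, once treating students as independent observations and once on class means.

```python
import math
import random


def ols(x, y):
    """Return (slope, standard error) of a one-predictor least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def simulate(n_classes=40, class_size=25, seed=1):
    """Synthetic classes with a shared class effect (all values assumed)."""
    rng = random.Random(seed)
    x_student, y_student, x_class, y_class = [], [], [], []
    for _ in range(n_classes):
        practice = rng.gauss(0, 1)        # class-level predictor, true effect zero
        class_effect = rng.gauss(0, 0.5)  # unmeasured between-class variation
        scores = [class_effect + rng.gauss(0, math.sqrt(0.75))
                  for _ in range(class_size)]
        x_student += [practice] * class_size
        y_student += scores
        x_class.append(practice)
        y_class.append(sum(scores) / class_size)
    return (x_student, y_student), (x_class, y_class)


if __name__ == "__main__":
    student, classes = simulate()
    for label, data in (("student level", student), ("class level", classes)):
        slope, se = ols(*data)
        print(f"{label}: slope={slope:.3f} se={se:.3f} t={slope / se:.2f}")
```

With equal class sizes the two slopes coincide, but the student-level standard error is much smaller because it ignores the clustering of students within classes, so the student-level t statistic is overstated. Neither analysis in this sketch models the two levels jointly, which is what the multilevel variants mentioned above attempt.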

As reflected in the above examples, the complexity of examining survey data on 
teaching and learning comes through loud and clear in most of the dissertations. In 
most cases, the investigations started with descriptive reports of bivariate relationships 
among the variables of interest and proceeded to condition successively on confounding 
variables whose effects might have been mistakenly overlooked in interpreting a
specific relationship. There are subtle intricacies in interpreting survey data and in
trying to nail down such elusive constructs as teaching, instruction, curriculum,
achievement, and learning. Experience can help but may not be decisive. The
dissertations examined here represent noble and often notable efforts to come to grips
with complicated schooling data. That the results from specific dissertations are no more
or no less definitive or illuminating than other analyses of SIMS and analyses of other
data bases is to be expected. The fact remains that much has been learned from these 
efforts about how to address various issues using SIMS as a data resource. 

Other Possible Dissertation Topics

While a lot of fertile ground has been covered by the dissertations described here, 
most of the issues already investigated using the SIMS data could warrant further study. 
Moreover, there is a considerable amount of as yet untapped territory that is conducive 
to dissertation research. Areas that warrant further scrutiny include the following: 

1. Determinants of the Distribution of Curricular Opportunities - I pointed out
earlier that in the U.S., variables that predict posttest performance are often as highly
associated with pretest performance. This pattern of results naturally leads to Kifer's
question of "who gets what?" Yet, as best as I can determine, none of the dissertations
thus far and none of the SIMS analyses other than Kifer's focuses on the factors that
account for the distribution of curricular opportunities and instructional experiences. It
was argued early on by Kifer and Wolfe, among others, that the most interesting
relationships in the longitudinal version of SIMS would involve the pretest rather than
the posttest. There are obviously competing conceptions of what constitutes appropriate
mathematics for students of varying ability and prior experience levels; moreover, the 
prevalence of certain conceptions varies cross-nationally. In-depth consideration of 
competing conceptions regarding access to mathematics content and how SIMS data 
might illuminate them would be welcome. 

2. Interconnections of Coverage and Emphasis - For whatever reason, most of
the dissertations have shied away from detailed examinations of the various ways of
measuring content coverage and emphasis. (The topic-specific teacher questionnaires are
still sorely underanalyzed despite the attention given them in the longitudinal
volume.) What I would like to see are theory-driven conceptions of coverage and
emphasis operationalized in a variety of ways and then the empirical consequences of
using different operationalizations considered. So far this has been attempted mainly for
OTL (e.g., Chang, Kao, Lehman).

3. "Case Studies" - Robin's analysis in the longitudinal volume points to clusters 
of teachers who tend to have common beliefs and employ similar constellations of 
instructional practices. While certain dissertations attempted to create clusters of
teachers with similar "styles" (e.g., Delandshere, Dirks, Hafner, Williams), most have
focused on a small portion of the data provided by teachers. I have suggested before that
one way of viewing the longitudinal SIMS data is as a large number of detailed "case
studies". What I meant by this characterization is that each teacher provided a
considerable amount of information and if these data were approached as if each teacher 
represented a separate case study, perhaps we could gain more insights about what 
constitutes the array of instructional treatments in mathematics. I could foresee 
identifying small subsets of teachers with specific constellations of responses to the 
teacher questionnaires and attempting to characterize these patterns and their 
consequences. Robin attempted this by brute force empirical methods but clearly more 
theory-driven approaches are possible. 

4. Ignoring or Capitalizing on Cultural Boundaries - In several instances, I have
suggested to students that perhaps they could learn more about classroom processes by
pooling SIMS data across countries. In this way, variation in instructional practices and
processes increases considerably and certain contrasts can be illuminated. For example,
imagine combining data from enriched and algebra classrooms in the U.S. with data 
from, say, French and Japanese classrooms and restricting attention to the set of test 
items for which most classrooms in the pooled data set had an opportunity to learn. In 
this data set, any culturally distinctive approaches to teaching mathematics and 
subsequent performance are likely to be highlighted for groups of students experiencing 
common curriculum intents (of course, what's excluded or not measured still matters). 
This is just one example of how cross-national data might benefit inquiry into issues of 
interest in a particular country. 

5. Grade 12 - At grade 12, there were longitudinal versions of SIMS conducted in
both British Columbia and the U.S. Yet these data remain virtually unanalyzed beyond
the national summary reports (only Williams' dissertation considered grade 12 data).
While there were certain structural complexities built into data collection at grade 12
that don't exist at grade 8, regularities in instructional practices and processes are likely
to be more evident. 



Concluding Remarks

Again, the above do not exhaust the possible areas of fruitful dissertation 
investigations that could use SIMS data. Nor are these topics likely to be any easier to 
study than those already investigated. Nevertheless, we believe that the dissertations 
completed and in progress clearly attest to the value of SIMS as a dissertation resource
and the value of the dissertations in achieving a better understanding of the issues that
analyses of SIMS data can address. Given the array of other avenues of empirical work 
possible with SIMS data and that institutions beyond Illinois and UCLA might want 
their students to take advantage of this data resource, we have hopefully only seen the 
tip of the iceberg with regard to the use of SIMS data for dissertation research.

Li-Chu Chang (1984). The effects of teacher and student perceptions of opportunity to
learn on achievement in beginning algebra in five countries. University of
Illinois at Urbana-Champaign

The purpose of this research was to investigate the relationship between teachers'
and students' perceptions of the opportunity to learn and student achievement in
algebra in 13-year-old students, Population A of the IEA Second International
Mathematics Study. Two related contexts were considered: (1) student entry knowledge
in mathematics and (2) the content domain being taught.

The data used in this study came from five countries: France, Japan, New
Zealand, Ontario-Canada, and the United States. The data analyses were done at the class
level. Four research questions were addressed in this study.

Are teacher opportunity to learn and student opportunity to learn good 
predictors of student achievement? Both teacher opportunity to learn and student 
opportunity to learn, taught this year, positively influenced achievement. However, in 
some countries the opportunity to learn variable had a small effect on achievement 
because of the homogeneity of the curriculum or the effect of having previously been 
taught the topic. 

Which is the better predictor of student achievement, teacher opportunity to 
learn or student opportunity to learn? Although the opportunity to learn as perceived 
by teachers is consistently higher than the opportunity to learn as perceived by students
in the corresponding classes, the student opportunity to learn rating is a better predictor
of achievement gain than is the teacher opportunity to learn rating. 

What is the relationship between the coverage and student achievement gain for 
each level of entry knowledge in mathematics? For each ability group, the mean teacher 
opportunity to learn score is higher than the corresponding mean student opportunity 
to learn score. Student opportunity to learn is a better predictor of student achievement
gain than the corresponding teacher opportunity to learn for high and middle ability 
classes. 

What level of coverage is optimal for student achievement gain in classes of 
high, middle and low knowledge in mathematics? A high student opportunity to learn 
rating appears to be an optimal condition for high and middle ability students. Time 
allocation itself was not a salient factor of achievement gain and no significant 
interactions between opportunity to learn and time allocation were found.

Josephine H. Charles (1985). A study of the teaching and learning of common and
decimal fractions in the eighth grade in Swaziland. University of Illinois at
Urbana-Champaign (James J. Hirstein, Advisor)



Purpose. The purpose of this study was to investigate the teaching and learning
of common and decimal fractions in the eighth grade in Swaziland. The study focused
on the three aspects of the mathematics curriculum: (a) the intended curriculum as
reflected in curriculum guides, course outlines, syllabi and textbooks; (b) the
implemented curriculum at the classroom level where teachers translate the intended
curriculum; and (c) the attained curriculum, what the students have learned as
measured by the tests and questionnaires.

Procedures and Analysis. The data lend themselves to three major classifications:
(a) curriculum data, including the context survey and textbook analysis; (b) classroom
data, including the Teacher, Topic and Attitude Questionnaires; and (c) student data,
including cognitive and attitude data.

The definition of Population A was modified for Swaziland as the grade level
where 13 year-old students should be found according to the school system. A pretest
was administered to 904 students in 25 classrooms in February 1980 and a posttest in
September 1980. The teachers responded to the Classroom Processes Questionnaire for
common and decimal fractions. Results of the Teacher Questionnaires and student
achievement tests were analyzed using Pearson's correlation and ANOVA.

Selected Findings and Conclusions. The classes were identified as remedial,
typical, enriched or accelerated with an average class size of 27. An equal amount of time
is spent on fractions and other topics in the mathematics curriculum. The majority of
the teachers were young and inexperienced. Much of the teachers' time is spent on
presenting new content or reviewing old material and a relatively small proportion of
time is spent on discipline or administration tasks. The textbook provided the
"boundaries" for what is taught. Limited use is made of resources beyond the textbook
for either content or methods of teaching. The majority of student time is spent
listening to teacher presentation, doing seat work or taking tests. Little time is spent on
group work. Instruction in fractions tends to be symbolic and formal with an emphasis
on computational proficiency. Students' performance is higher on common fractions
and on application level items. Both the teachers' attitudes and beliefs and students'
attitudes and beliefs had no effect on student achievement.

Ginette Delandshere (1986). Structural equation modeling applied to multilevel data:
The effect of teaching practices on eighth-grade mathematics achievement.
Unpublished doctoral dissertation. University of California, Los Angeles (Leigh
Burstein, Advisor)

Student achievement is mostly affected by three types of variables: student ability 
or aptitude, student characteristics (i.e., home background), and various combinations of 
teacher and classroom characteristics.

The present study questions the adequacy of the analytical models traditionally
used in school and classroom effect research. It is assumed here that the variability in
the relationship between student characteristics and achievement could be more
effectively examined as a function of students' instructional experience. The analytical
scheme proposed here is intended to reflect the multilevel nature of the data, to take
measurement error into account, and to allow for the examination of the
interrelationship among the predictors of student achievement.

The investigation is carried out with data collected from students and teachers in
226 U.S. eighth-grade mathematics classrooms (Second International Mathematics Study
under the auspices of IEA). The analytical scheme tested here includes the following
steps: 1) classification of teachers according to instructional practices using three
clustering algorithms (K-means, Ward's method, and NORMDC), 2) comparison of the
effect of group membership defined by clustering on achievement to more traditional
methods (regression and ANCOVA), 3) estimation of a student achievement model (using
LISREL) within each group as defined by clustering, and 4) comparison of the model
across groups to assess the structural differences in student achievement due to
differences in instructional practices.

A five cluster solution was retained, and cluster membership was found to 
account for an amount of variance comparable to that which would be explained by 
regressing achievement directly on the teacher variables used to identify the clusters. 
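
The first two steps of this scheme can be sketched in a few lines. The code below is a hedged illustration, not the procedure used in the dissertation: it implements a generic K-means (Lloyd's algorithm) and then computes the share of achievement variance accounted for by cluster membership (eta squared). The four hypothetical teachers, their two practice scores, and their class achievement values are invented for the example.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Generic K-means on tuples of floats; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def eta_squared(values, labels):
    """Proportion of variance in values lying between the labelled groups."""
    grand = sum(values) / len(values)
    total = sum((v - grand) ** 2 for v in values)
    between = 0.0
    for c in set(labels):
        group = [v for v, lab in zip(values, labels) if lab == c]
        between += len(group) * (sum(group) / len(group) - grand) ** 2
    return between / total


if __name__ == "__main__":
    # Hypothetical teachers described by two instructional-practice scores.
    practices = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    achievement = [1.0, 2.0, 9.0, 10.0]  # hypothetical class mean scores
    labels = kmeans(practices, k=2)
    print(labels, round(eta_squared(achievement, labels), 3))
```

Comparing this eta squared with the R-squared from regressing achievement directly on the practice scores is the kind of comparison the paragraph above reports. Ward's method and the third algorithm named in the abstract are not sketched here.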

Structural equation modeling was then used to fit a student achievement model
separately in each cluster. A good fit was obtained for the model in at least three of the
clusters. Finally, a multiple group analysis was conducted on the three clusters,
revealing differences in the structural parameters across groups.

Guliaya T. Dhompongsa (1984). The teaching and learning of mathematics in eighth
grade classes in Thailand. University of Illinois at Urbana-Champaign



This study surveyed and analyzed data relating to classroom processes and 
student achievement in mathematics in Thailand. It also inquired into the relationships 
between such processes and achievement, and investigated the differences in 
instructional behaviors among teachers whose students exhibited low learning gain. 
Furthermore, the study examined factors affecting student achievement in mathematics. 

The study was conducted in Thailand in conjunction with the Second IEA
International Mathematics Study. The sample, drawn through the use of the probability
proportional to size (PPS) sampling procedure, consisted of 45 classrooms from 23
schools in 10 provinces, with two classes per school and a total of 1,910 eighth grade
students. The data collected included students' pretest and posttest achievement,
classroom processes reported by the teachers, and information on student home
background, teacher characteristics and school conditions. These data were obtained 
through the administration of relevant tests and questionnaires. 
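
For readers unfamiliar with PPS selection, the idea can be sketched as follows. This is a minimal illustration of one common variant, systematic PPS with a random start; the abstract does not say which variant was used in Thailand, and the school enrollments in the example are hypothetical.

```python
import random


def pps_systematic(sizes, n, seed=0):
    """Select n units with probability proportional to size.

    sizes: a size measure per unit (e.g. enrollment). Returns the indices of
    the selected units; a unit larger than the sampling interval may be
    selected more than once.
    """
    rng = random.Random(seed)
    step = sum(sizes) / n          # sampling interval
    start = rng.uniform(0, step)   # random start within the first interval
    picks, cum, idx = [], 0.0, 0
    for i in range(n):
        target = start + i * step
        # Advance until the cumulative size range of unit idx contains target.
        while cum + sizes[idx] <= target:
            cum += sizes[idx]
            idx += 1
        picks.append(idx)
    return picks


if __name__ == "__main__":
    enrollments = [100, 50, 300, 50, 500]  # hypothetical school sizes
    print(pps_systematic(enrollments, n=2))
```

Larger schools cover a longer stretch of the cumulative size scale and are therefore more likely to contain a selection point. Drawing a fixed number of classes within each selected school, as in the design described above, would be a separate second stage.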

Descriptive results regarding the ways the teachers provide instruction of ratio,
proportion and percent were reported both verbally and graphically. Some of the more
important findings obtained from the multivariate analyses are as follows: 1) Student
prior knowledge in mathematics and consistency of instruction contribute the most to
student post-achievement variance. 2) The variables associated with high-gain teachers
seem to be consistency of instruction, use of class time in explaining new content and in
managing the classroom, and emphasis on practice and drill more than on problem
solving. 3) The variables associated with low-gain teachers seem to be the use of a variety of
teaching techniques and the emphasis on problem solving more than on practice and
drill. 4) Students' prior knowledge of mathematics appears to affect students' final
achievement in mathematics directly and strongly, while the classroom process factors
seem to have negligible effect on achievement. Other background factors show minimal
indirect effect on achievement, but home status and processes in the home strongly and
directly affect student prior knowledge in mathematics, which, in turn, affects students'
final achievement.

Michael K. Dirks (1986). The operational curricula of Mathematics 8 teachers in British
Columbia. The University of British Columbia, Canada (David F. Robitaille,
Supervisor)

The purpose of this study was to describe the mathematics curricula as actually 
implemented by a sample of Mathematics 8 teachers in British Columbia. A survey of 
previous research indicated that knowledge about the mathematics subject matter which 
teachers present to their students and the interpretations which teachers give to that 
subject matter is sparse in spite of the importance such knowledge might have for the 
curriculum revision process, textbook selection, the identification of in-service 
education needs, and the interpretation of student achievement results. 

The Mathematics 8 curriculum was divided into three content areas: arithmetic, 
algebra, and geometry. Within these content areas a total of 16 topics were identified as 
among the basic topics of the formal Mathematics 8 course. Four variables were 
identified as representing important aspects of a mathematics curriculum. The first of 
these, content emphasis, was defined as a function of the amount of time a teacher spent 
on each content area. The other three variables, mode of content representation, 
rule-orientedness of instruction, and diversity of instruction, were defined as functions of the 
content-specific methods teachers used to interpret the topics to their students. 

Class achievement level and the primary textbook were identified as having 
strong potential relationships with a teacher's operational curriculum. These were used 
as background variables in this study. 

The data for this study were collected as part of the Second International 
Mathematics Study during the 1980-1981 school year. The sample consisted of 93 teachers 
who submitted five Topic Specific Questionnaires throughout the school year regarding 
what they taught to one of their Mathematics 8 classes. 

Among the findings of this study were: (1) Wide variation existed in the 
emphasis given by teachers to the three content areas with 60% giving at least one area 
light or very light emphasis. (2) Teachers using a text which placed more emphasis on a 
particular content area tended to spend more time on that content area in their 
classes. (3) Teachers of low achievement classes tended to present mathematics in a 
slightly more abstract and rule-oriented way than teachers of high achievement classes. 

Cheryl L. Fagnano (1988). An investigation into the effects of teachers' subject matter 
and subject specific pedagogy training on the mathematics achievement of 
eighth-grade mathematics students. Unpublished doctoral dissertation. 
University of California, Los Angeles (Lewis H. Solmon and Leigh Burstein, 
Advisors) 



This study addresses the empirical question: do the amount and kind of 
training a mathematics teacher acquires affect his or her choice of classroom processes, 
pedagogical beliefs, and ultimately students' mathematics achievement? The United 
States 8th grade sample from the Second International Mathematics Study was the data 
source. Using a two stage analysis, four models of multiple regression were used to 
investigate the study's hypothesis. Three outcome measures were investigated: student 
posttest scores, classroom processes, and teacher pedagogical beliefs. The major 
independent variables of interest were three types of teacher training: subject specific, 
pedagogical, and general education. The study's most significant finding was that 
increased amounts of pedagogical training were negatively associated with student 
achievement. This finding, while suggestive, was inconclusive due to problems of 
multicollinearity between measures of teacher and student quality. 

Helen E. Garnier (1988). Curriculum comparisons: Examination of eighth-grade 
mathematics instruction data from the Second International Mathematics Study 
in the United States. Unpublished doctoral dissertation. University of California, 
Los Angeles (Marvin C. Alkin and Leigh Burstein, Advisors) 

Previous research has identified the classroom textbook as the major contributing 
factor to determining what teachers teach and what students learn. Given the enormous 
potential of the textbook to guide instructional processes, the textbook is an essential 
variable to be included in any comparison of different curricula. National mathematics 
data from the Second International Mathematics Study provided information on 
students, teachers, and instructional processes. The four most frequently used U.S. 
eighth grade mathematics textbooks from that study were used to investigate curricula 
characteristics. The extent to which different textbooks influenced different instructional 
processes and different patterns of student achievement was examined. 

Qualitative comparisons of the textbooks indicated differences in both content 
coverage and presentation. Descriptive analyses of student, teacher, and instructional 
process measures identified statistically significant differences in mathematics curricula. 
The degree to which student, teacher, and instructional process variables explained 
variation in mathematics achievement scores also differed across the curricula defined 
by textbook choice. 

The results of the analyses provided essential information about evaluation 
questions of effectiveness and causality. Students in typical classrooms using the more 
advanced mathematics textbook had significantly higher mathematics achievement 
scores in arithmetic, geometry, measurement, and algebra. They had the largest gains in 
geometry, and in comprehension and application skills. Teachers using the more 
advanced textbook were the oldest, most experienced, and most educated teachers. They 
emphasized problem solving skills and developing an attitude of inquiry more than 
other teachers. They provided more opportunity to learn and emphasized more 
teaching methods in all mathematics subjects. These teachers used self-written materials 
more than other teachers. 

The analyses suggest further studies might be done on the contribution of teacher 
and instructional process variables to explain the variation in mathematics 
achievement scores for remedial and enriched students. Also, more detailed analyses of 
mathematics topics within arithmetic, geometry, measurement, and algebra are 
suggested. 

Anne L. Hafner (in progress). The use of teaching method scales in exploring the 
relationship between mathematics teaching styles and differential class 
achievement. Dissertation in progress. University of California, Los Angeles 
(Leigh Burstein and Richard J. Shavelson, Advisors) 



The purpose of this study is to examine the influence of specific teaching practices 
on class-level student mathematics achievement. Prior studies have identified general 
teaching behaviors which are related to student achievement, but not math-specific 
behaviors. The major contribution of this study is to identify teaching styles/practices in 
the mathematics content domain which influence class mathematics achievement. In 
addition, the study will attempt to disentangle the OTL (content coverage) influence 
from the teaching method influence. 

The study tests the hypothesis that teaching practices will influence differential 
content coverage (OTL) above and beyond the influence of prior class achievement and 
background variables. It also hypothesizes that after controlling for OTL and background 
variables, differential performance between classes will still exist which may be 
attributable to teaching styles or practices. Finally, it is hypothesized that classes taught by 
various teaching "styles" will show differential achievement, and that practices which 
focus on perceptual presentation, which stress an informal approach that links across 
mathematics concepts, and which use multiple concept interpretations will best predict 
high achievement. 

Sirichai Kanjanawassee (1989). Alternative strategies for policy analysis: An assessment 
of school effects on students' cognitive and affective outcomes in lower 
secondary schools in Thailand. Unpublished doctoral dissertation. University of 
California, Los Angeles (Marvin C. Alkin and Leigh Burstein, Advisors) 

The purpose of this study is to consider alternative multilevel strategies to assess 
the school effects on various dimensions of student outcomes. The present study 
questions the adequacy of the conceptual analytical models used in school effectiveness 
research. The conceptual strategies proposed here were intended to obtain a relevant 
model which reflects the multilevel nature of educational data, while the analytical 
strategies which take into account the multilevel structure were aimed to allow for the 
variation of coefficient estimates between levels, and to test the fit of school effect 
models. The investigation was carried out with data from the Second International 
Mathematics Study (SIMS) collected in Thailand from 4,030 eighth grade students and 
their mathematics teachers and administrators in 99 schools. The analytical strategies to 
detect, explain, and compare the school effects included variance component analysis, 
standard regression analysis, hierarchical analysis of covariance, and selected multilevel 
analysis techniques (OLS single equation, OLS separate equation, and HLM approaches). 
The major findings can be summarized as follows. 1) The alternative strategies for 
traditional multilevel analyses are needed in order to provide more realistic, 
informative, and accurate assessment of school effects. 2) Thai schools did differ in 
enhancing students' status and growth in cognitive and affective mathematics 
outcomes. 3) The outcome variables were affected by multilevel variables: student 
backgrounds, class/school characteristics, and socio-cultural contexts. 4) The important 
variables affecting the outcomes were students' prior achievement and attitude, 
expectation for further education, use of home calculator, parents' contribution to the 
learning, parents' motivation, peers' achievement, class size, teacher experience, 
student-teacher ratio, and qualified mathematics teacher ratio. The student 
backgrounds tended to have strong effects on students' status in cognitive and affective 
outcomes, whereas the class/school characteristics tended to have strong effects on 
students' growth in cognitive and affective outcomes. 

Chi-Fen Kao (in progress). An investigation of instructional sensitivity in mathematics 
achievement test items for U.S. eighth grade students. Dissertation in progress. 
University of California, Los Angeles (Bengt O. Muthén, Advisor) 

The purpose of this dissertation is to further elaborate and study the applicability 
of the extended IRT model developed by Muthén (1987). Muthén's approach allows for 
the incorporation of auxiliary information about the background and characteristics of 
students in the estimation of an IRT measurement model. The effects of auxiliary 
variables on ability estimates and the effects of ability and auxiliary variables on 
performance can be estimated within a common modeling framework. 

The dissertation focuses on refinements in the investigation of the instructional 
sensitivity of test items using the SIMS data base. In earlier analyses family background 
and item specific opportunity-to-learn (OTL) information were used in studying 
performance on the items from the core test. The work is expanded in the following 
ways: (1) the analyses will be done with the pool of 180 items from both core and rotated 
forms with procedures developed to handle the "random missingness" involving the 
rotated forms; (2) the array of instructional variables will be extended beyond item- 
specific OTL; (3) new ways will be developed to handle OTL other than as item specific 
influence. 

The study attempts to answer such questions as: 

Does instructional coverage affect achievement performance in addition to its 
effects on latent ability? 

If an item is instructionally sensitive, is it still a good measurement of the ability? 
What kinds of items tend to be sensitive? 

James D. Lehman (1986). Opportunity to learn and differential item functioning. 
Unpublished doctoral dissertation. University of California, Los Angeles (Leigh 
Burstein and Bengt O. Muthén, Advisors) 



This study was intended to examine the impact of differences in opportunity to 
learn (OTL) item content on the functioning of items and the degree to which such 
differences can lead to improved understanding of the results from investigations of 
item bias. The present study sought to demonstrate two things: (1) student differences in 
opportunity to learn item content cause differential item functioning (DIF); and (2) 
statistical indications of item bias (in the present case associated with gender) confound 
differences in item functioning attributable to gender with those due to differences in 
opportunity to learn. An Item Response Theory approach to item bias and differential 
item functioning was used to address the questions of the study. The data source was a 
sample of eighth graders in the U.S. who participated in the Second International 
Mathematics Study (SIMS). The items investigated were taken from the 40-item core test 
of mathematics. The analysis focused primarily on algebra items from this test because 
of the substantial variability of OTL across students in this topic area. 

The primary results can be summarized as follows: (1) All eight algebra items 
exhibited differential item functioning associated with differences in OTL. Specifically, 
the item characteristic curves for high and low OTL groups indicated that students of a 
given ability level in the high OTL groups had a higher probability of getting the algebra items 
correct than members of the low OTL groups. (2) Evidence of possible gender bias was 
found in only two of the eight items. Thus it was not possible to conclude that OTL DIF 
confounds gender DIF. The lack of confounding must also be attributed to the very 
similar levels of OTL between boys and girls. However, OTL DIF in this population on 
this type of test was clearly shown. 

Katherine E. Ryan (1987). A conceptual framework for investigating test item 
performance with the Mantel-Haenszel procedure. Unpublished doctoral 
dissertation. University of Illinois at Urbana-Champaign (Robert L. Linn, 
Advisor) 



Recently, the Mantel-Haenszel (MH) procedure has been suggested as an 
alternative procedure to IRT methods for investigating item bias (Holland & Thayer, 
1986; McPeek & Wild, 1986). However, there are few studies examining the stability of 
the MH procedure across different samples of test takers (see McPeek & Wild, 1986). No 
studies have examined whether the Mantel-Haenszel estimates are stable within 
different sets of items. This study examined the stability of the MH estimates across 
different samples of test takers as well as across different sample sizes; investigated 
whether the MH procedure is robust with respect to item context effects; and examined whether 
the identification of differential item functioning can be improved by controlling for the 
multidimensionality of the matching criteria by controlling on an additional criterion. 
Results indicated that a sample of 6000 for black-white comparisons was not adequate for 
obtaining stable estimates from the MH procedure, while the MH odds ratio appears to be 
robust to item context effects. 

Nongnuch Wattanawaha (1986). A study of equity in mathematics teaching and learning 
in lower secondary school in Thailand. University of Illinois at Urbana- 
Champaign (Kenneth Travers, Adviser) 



The major purposes of this study were 1) to assess the extent to which the recently 
reformed (1978) Thai national curriculum has been implemented by teachers in 
different regions of the country, 2) to assess the extent of variation of student 
achievement across regions and across schools, and 3) to explore some determinants of 
achievement patterns that are potentially within the control of the school system, 
particularly content coverage and classroom practice. This study was undertaken as part 
of the Second International Mathematics Study (SIMS), using international data as well 
as national data. 

The findings suggest that 1) There are no significant differences in the coverage of 
major areas of content across any of the analytic units investigated in this study: 
educational regions, social-cultural contexts, and classrooms of different achievement 
levels. 2) There are no significant differences in student achievement among 
educational regions but there are differences at the class level which tend to be associated 
with rural and urban environments. 3) High-achieving classes and low-achieving 
classes do not vary in content coverage, but do show patterns of differences which can be 
interpreted in terms of conceptions of active teaching proposed by Good, Grouws, and 
Ebmeier (1983). 

The design of SIMS permitted a comparison of Thailand with other nations. Thai 
national achievement is lower than that of most other nations, but when the 
achievement of students in Bangkok is compared with other nations, the ranking is 
similar to that of the United States of America and New Zealand. 

John B. Williams (1988). The teaching of calculus in high schools in the United States. 
Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign 
(Kenneth J. Travers, Advisor) 



This thesis utilized data from the Second International Mathematics Study to 
characterize U.S. high school calculus classes and to identify aspects of teachers and 
teaching of calculus that accounted for differences in class achievement. The factors 
examined were (a) the degree to which teachers' presentations of mathematics were 
process oriented, (b) the degree to which teachers used formal methods of instruction, (c) 
the extent to which teachers relied on the textbook, (d) the percent of time that the class 
spent working in small groups, and (e) the percent of time that students spent working 
alone. A detailed profile of high school calculus teachers and classes was developed, 
including such variables as teacher background, curriculum content, manner of teacher 
presentations, and decisions regarding the teaching of the target class. 

Of the five factors and their interactions, none showed a significant relationship 
to achievement. Exploratory analyses suggested that classes which spent less time in 
small groups showed a greater achievement in comprehension accompanied with 
higher variance in overall achievement. The data suggested that greater teacher reliance 
on the textbook coupled with more time spent in small groups was associated with 
lower achievement at both the computation and comprehension levels. Finally, teacher 
presentations containing formal proofs were associated with greater variance among 
classes at the higher cognitive levels of achievement. Implications for future 
development of the high school curriculum in calculus are included. 

CONTENT REPRESENTATION IN COLLEGE ALGEBRA: 
SUMMARY REPORT 

Peter Lochiel Glidden 
University of Illinois at Urbana-Champaign 

Abstract 

The Second International Mathematics Study College Algebra Classroom Process Data 
for Population B were examined: (a) to study reasons cited by teachers for teaching 
subtopics, (b) to study reasons cited for selecting particular content representations, and (c) to 
determine what relationships exist, if any, between teachers who use multiple content 
representations and their teaching decisions, professional opinions, backgrounds, classes, 
and schools. Important differences were found in the reasons cited for and against teaching 
subtopics. Considerable differences were found in teacher choice of representation and in 
reasons cited for and against use of a particular representation. Relationships were found 
between teachers who use multiple representations and: (a) their development and use of 
supplemental materials, (b) their presentation of content, and (c) their sources of ideas for 
applications. Evidence was found relating the use of multiple representations to teacher 
background and education. 

The method or strategy used to present or interpret a mathematical concept is 
important for curriculum designers, textbook authors, and classroom teachers. This finding 
is consistent with and supported both by traditional learning theories (e.g., Ausubel, 1968; 
Piaget, 1975; Novak, 1977) and any general or extensible cognitive science model of learning 
(e.g., Winston, 1972; Lebowitz, 1983). 

McKnight and Cooney (in press) examined content representation for Population A 
for all systems completing the classroom process surveys. Their study investigated various 
aspects of representation used including: use of symbolic vs. perceptual representation, 
variety of representations used, balance between symbolic and perceptual representations, 
and teacher opinions vs. representations used. They found some evidence of a relationship 
between time allocated for instruction and variety of representations used and some 
individually interesting results about teacher opinions. But no clear, overall patterns emerged 
from the data. 

Goals of this Analysis 

Instead of examining content representations across systems, we shall increase the 
magnification of our microscope and consider content representations for Population B 
College Algebra in the United States. The major goals of the present study are threefold: (a) 
to examine how logarithms and complex numbers are taught; (b) to determine why 
teachers choose particular concept representations; and (c) to determine what relationships, if 
any, exist between teachers' use of multiple content representations and their teaching 
decisions, professional opinions, backgrounds, classes, and schools. 

What is Being Taught about Complex Numbers and Logarithms 

Figure 1 illustrates subtopic coverage for the topics of logarithms and complex 
numbers (N teachers = 153). A subtopic such as complex roots of quadratic equations is 
"covered" if it has been taught as new or reviewed and extended or reviewed only. A 
subtopic is "not covered" if it is assumed or not assumed and not covered. (Full 
descriptions of the labels along the vertical axis are given in Appendix 1.) Polar coordinate 
representations of complex numbers and DeMoivre's Theorem overwhelmingly are taught as 
new, and complex roots of quadratic equations and laws of logarithms are almost equally 
taught as new and reviewed and extended. The remaining subtopics are covered mostly as 
new. 

Figure 1 

Coverage of Logarithm and Complex Number Subtopics 

Figure 2 illustrates the positive reasons given by teachers for teaching the subtopics. 
Teachers were asked to mark as many reasons as applied. (For this and the remaining 
figures the subtopics are displayed in order of decreasing coverage within each topic.) The 
subtopics most often taught usually have the most reasons cited why the subtopic should 
be taught. For all subtopics, useful later is the most frequently cited reason followed by 
text (for six out of eight), syllabus/external examination, the subtopic is well known to the 
teacher, and the subtopic is related to prior mathematics. With the single exception of 
DeMoivre's Theorem, mathematical content reasons (related to prior or useful later) 
consistently provide most of the reasons for teaching subtopics. These reasons are followed 
closely by external reasons (text and syllabus/external examination). 

Figure 2 

Positive Reasons Given for Teaching Subtopics 

By contrast, Figure 3 illustrates reasons cited for not teaching a subtopic. As might 
be expected, subtopics taught less frequently have more reasons cited for not teaching them 
than frequently taught subtopics. Overwhelmingly, teachers cite external reasons 
(syllabus/external examination most frequently, followed by text) for not teaching subtopics 
with never considered playing a supporting role. Easy to teach, enjoyed by students, or 
easy for students to understand rarely are cited either for or against teaching particular 
subtopics. Therefore, the data in Figures 2 and 3 suggest that for a teacher to decide to 
teach a subtopic, not only must the subtopic be included in the syllabus or text, but the 
teacher must be familiar with the topic, know how the topic will be useful later, and know 
how the topic relates to prior mathematics. 

Figure 3 

Reasons Given for NOT Teaching Subtopics 

Concept Representations 

Description of Representations 
Complex Numbers 

The following four interpretations of complex numbers were considered in the 
survey (SIMS, 1985): 

$i^2 = -1$. From $x^2 + 1 = 0$, we define $i^2 = -1$ and then use the distributive property to 
give a rationale for the product: 

$(a + bi)(c + di) = ac + bdi^2 + bci + adi = (ac - bd) + (bc + ad)i$ 

Dilation, Rotation. Multiplication is considered as a rotation transformation 
followed by a dilation (stretch or shrink) transformation. 

If $z_1 = a + bi = r_1(\cos\alpha + i\sin\alpha)$ and $z_2 = c + di = r_2(\cos\beta + i\sin\beta)$, 
then $z_1 z_2 = \underbrace{r_1 r_2}_{\text{Dilation}}\,\underbrace{\{\cos(\alpha + \beta) + i\sin(\alpha + \beta)\}}_{\text{Rotation}}$ 

Definition of Multiplication. Multiplication is defined by stating 
$(a + bi)(c + di) = (ac - bd) + (bc + ad)i$ and then it is verified that multiplication in $\mathbb{C}$ satisfies 
the various algebraic properties of a field. 

Ordered Pair. Multiplication is defined as follows: 

If $z_1 = (a, b)$ and $z_2 = (c, d)$, 

then $z_1 z_2 = (ac - bd,\ bc + ad)$. 
After the definition is stated, the operation is checked to see if multiplication thus defined 
satisfies the algebraic properties of a field. 
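
The four interpretations describe the same operation, which can be checked numerically. The following is a minimal sketch, not part of the survey; the function names are hypothetical and chosen only for illustration. The "definition of multiplication" and "ordered pair" interpretations share one rule, so a single function stands in for both.

```python
import cmath
import math


def multiply_distributive(a, b, c, d):
    """Expand (a + bi)(c + di) term by term, then substitute i^2 = -1."""
    real = a * c + b * d * (-1)  # ac + bd * i^2
    imag = b * c + a * d         # coefficient of i
    return (real, imag)


def multiply_dilation_rotation(a, b, c, d):
    """Multiply the moduli (dilation) and add the arguments (rotation)."""
    r1, alpha = cmath.polar(complex(a, b))
    r2, beta = cmath.polar(complex(c, d))
    product = cmath.rect(r1 * r2, alpha + beta)
    return (product.real, product.imag)


def multiply_ordered_pairs(z1, z2):
    """Ordered-pair rule: (a, b)(c, d) = (ac - bd, bc + ad)."""
    (a, b), (c, d) = z1, z2
    return (a * c - b * d, b * c + a * d)


def interpretations_agree(a, b, c, d, tol=1e-9):
    """Return True when all three computations give the same product."""
    expected = multiply_distributive(a, b, c, d)
    candidates = [
        multiply_dilation_rotation(a, b, c, d),
        multiply_ordered_pairs((a, b), (c, d)),
    ]
    return all(
        math.isclose(x, e, abs_tol=tol) and math.isclose(y, f, abs_tol=tol)
        for (x, y) in candidates
        for (e, f) in [expected]
    )
```

For example, `interpretations_agree(1, 2, 3, 4)` compares the three computations of $(1 + 2i)(3 + 4i)$; the polar route is subject to floating-point rounding, hence the tolerance.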

As Figure 4 shows, $i^2 = -1$ was overwhelmingly (over 60%) the most frequently used 
interpretation while dilation, rotation was the least used (not used by 67% of the teachers). 
The other two interpretations, ordered pair and definition of multiplication, were used 
frequently or infrequently by 45% and 63% of the teachers, respectively. (For Figure 4 and 
the remaining figures of this section, the interpretations are ordered from left to right in 
order of decreasing frequent use, which coincidentally is the same order as frequent/ 
infrequent use.) Overall, teachers used the same concept representation for all students 
rather than differentiating by ability. 

Figure 5 illustrates positive reasons cited for using a particular concept 
representation. These reasons follow much the same pattern as the reasons for covering a subtopic, 
with one notable exception. As with subtopic coverage, the number of reasons cited 
correlates directly with the number of teachers who used the interpretation and the specific 
reasons frequently cited are content (uses prior, useful later) and external (text, syllabus/ 
external examination) with well known again playing a supporting role. For concept 
representation, however, easy to understand and easy to teach frequently are cited, but 
they are not often cited as a reason for teaching a subtopic. As Figure 6 shows, reasons cited 
for not using a particular interpretation largely were external (text, syllabus/external 
examination) and never considered with prerequisites unknown listed for dilation, 
rotation. 

Figure 4 

Interpretations Used for Complex Numbers 

Figure 5 

Positive Reasons for Using Interpretations 

Figure 6 

Reasons Why Interpretations Were NOT Used 

Logarithms 

The following four representations were considered in the survey (SIMS, 1985): 

Exponent Base. Logarithms are defined as exponents. Students abstract the 
generalization from observing, and working with, patterns such as 

$4 \times 32 = 2^2 \times 2^5 = 2^7 = 128$ 

Here $\log ab = \log a + \log b$ is considered a restatement of $10^a \times 10^b = 10^{a+b}$. 

Inverse Function Base. A logarithmic function is defined as the inverse of the 
exponential function 

$f(x) = 10^x$ 

Consider the graph of the log function. It is observed for several specific problems that the 
ordinate at $x = ab$ is equal to the sum of the ordinates at $x = a$ and at $x = b$. Thus 
$\log ab = \log a + \log b$. 

Area Under a Curve Base. Logarithmic functions are defined in terms of area under 
curves of the form 

$f(x) = k\left(\frac{1}{x}\right)$ ($y = \log x$ is associated with $k = 0.434$) 

$\log b$ is then defined as the area under the graph of $f(x)$ for $1 \le x \le b$. By counting 
squares on a fine grid paper for several problems, students for [sic] the generalization that 
the area under the curve from 1 to $ab$ is the sum of the area under the curve from 1 to $a$ and 
from 1 to $b$. 
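
The square-counting activity in the area-under-a-curve representation can be imitated numerically. The sketch below is an illustration only, under the assumption that the curve is $f(x) = k/x$ with $k = \log_{10} e \approx 0.434$; the function name and the midpoint rule are choices made here, not part of the survey.

```python
import math

# Assumed constant: k = log10(e), which is about 0.434.
K = math.log10(math.e)


def area_under_curve(b, k=K, steps=100_000):
    """Midpoint-rule estimate of the area under f(x) = k / x for 1 <= x <= b."""
    width = (b - 1.0) / steps
    total = 0.0
    for i in range(steps):
        midpoint = 1.0 + (i + 0.5) * width
        total += (k / midpoint) * width
    return total


if __name__ == "__main__":
    a, b = 2.0, 3.0
    # The area from 1 to ab should match the sum of the areas from 1 to a
    # and from 1 to b, and each area should be close to log10 of its endpoint.
    print(area_under_curve(a * b), area_under_curve(a) + area_under_curve(b))
    print(area_under_curve(10.0), math.log10(10.0))
```

Each narrow rectangle plays the role of a column of grid squares, so the additive pattern students are asked to notice appears as agreement between the two printed values on the first line.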




As with complex number representations, teachers by and large did not differentiate 
representations by student ability, and one particular representation dominated (exponent 
base) and one representation rarely was used (area under curve) (see Figure 7). (For Figure 
7 and the remaining figures of this section, the representations are displayed in order of 
decreased use.) 

Figure 7 

Time Spent on Each Logarithm Representation 

Figure 8 illustrates the number of positive reasons cited for using each 
representation. As seen before, there is a direct relationship between reasons cited and representation 
use. Paralleling complex representations, enjoyed by students and easy to teach were 
cited rarely and external (text, syllabus/external examination), content reasons (related 
prior, useful later), well known, and easy to understand were cited frequently. 

The reasons often cited for not using a logarithm representation also closely parallel 
negative reasons for complex number representations. (See Figure 9.) The negative reasons 
most frequently cited are: text, never considered and syllabus/external examination. As 
with the least used complex representation (dilation, rotation), the least used logarithm 
representation (area under curve) had prerequisites unknown frequently cited. 

Figure 8 

Positive Reasons for Using Representation 

Therefore, analogous to subtopic coverage, the data suggest that for a teacher to use 
a representation, not only must the representation be included in the syllabus or text, but 
the teacher must be familiar with the representation, the representation must be easy for 
students to understand, the students must know the prerequisites, and the teacher must 
know how the representation will be useful later. How much students enjoy a 
representation or how difficult it is to teach are much less important to teachers in selecting content 
representations. 

Figure 9 

Multiple Representations 

When teacher coverage of logarithms and complex numbers is compared with the 
number of subtopics covered for each topic, numerous inconsistencies are found. For 
example, teachers did not mark a topic as covered even though they covered all four 
subtopics. Consequently, for a topic to have been covered either the teacher marked it as 
covered or the teacher covered at least three of the four subtopics. In our examination of 
multiple representations and their characteristics, we include only those teachers who 
covered the topics of complex numbers and logarithms. 
C omplex Numbers 

The two multiple representation indices for complex numbers we examine are: (a)
Complex Frequent and (b) Complex Used. For Complex Frequent we shall examine those
teachers who frequently used: (a) at most one representation and (b) more than one
representation. For Complex Used we shall examine teachers who used (either frequently or
infrequently): (a) at most one representation, (b) exactly two representations, and (c) more
than two representations. Table 1 lists the numbers of each.







Table 1

Numbers of Teachers Using Multiple Representations

                                                  Complex     Logarithm
                                                  Frequent    Frequent

Used at most one representation frequently           77           84
Used more than one representation frequently         44           32

                                                  Complex     Logarithm
                                                  Used        Used

Used at most one representation                      33           34
Used two representations                             29           56
Used more than two representations                   59           26

Total                                               121          116



Logarithms

For the complex number representations, teachers were specifically asked if they:
(a) use this interpretation frequently; (b) have used this interpretation, but infrequently; or
(c) do not use this interpretation. For logarithm interpretations, however, teachers were
asked the number of periods they studied each interpretation. Therefore the construction of
multiple representation indices for logarithms requires additional steps. First, class periods
for each interpretation were converted into minutes, which were then categorized as: not
used, time = 0; used infrequently, 0 minutes < time < 75 minutes or about one period; and
used frequently, time > 75 minutes or more than about one period. The results of this
classification scheme are given in Table 2.
Table 2

Classification of Logarithm Representations

                          Exponent    Inverse     Area Under
                          Base        Function    Curve

Not Used                      5          35           84
  (Time = 0)
Used Infrequently            36          35            8
  (Time < 75 Minutes)
Used Frequently              61          33           11
  (Time > 75 Minutes)

Total                       104         103          103



While other classification schemes clearly exist, this scheme has three main advantages:
(a) it is reasonable (there was a natural break in the data between 58 and 80 minutes
for each interpretation), (b) it allows us to compare and contrast multiple representation
use for complex numbers and logarithms, and (c) it helps to eliminate complicating factors
such as time spent on one particular representation.
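
The three-way classification of time per interpretation described above can be sketched as
a small function. This is an illustration only, not the authors' procedure: the name
`classify_time` and its `cutoff` parameter are hypothetical, and because the source leaves
the treatment of exactly 75 minutes open, the sketch assumes that value counts as
infrequent use (the reported break in the data between 58 and 80 minutes means no
observation should fall on the boundary).

```python
def classify_time(minutes: float, cutoff: float = 75.0) -> str:
    """Map minutes spent on one logarithm interpretation to a usage category."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    if minutes == 0:
        return "not used"
    # Assumption: a value exactly at the cutoff is treated as infrequent use.
    if minutes <= cutoff:
        return "used infrequently"
    return "used frequently"


# Hypothetical times, in minutes, for the three logarithm interpretations.
example = {"exponent base": 120, "inverse function": 45, "area under curve": 0}
categories = {name: classify_time(t) for name, t in example.items()}
print(categories)
```

Taking minutes rather than class periods as input keeps the period-to-minute conversion,
whose details the paper does not give, outside the sketch.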

Therefore, to parallel complex multiple representations, the two multiple representation
indices for logarithms we shall examine are: (a) Logarithm Frequent and (b)
Logarithm Used. For Logarithm Frequent we shall examine those teachers who frequently
used: (a) at most one representation and (b) more than one representation. For Logarithm
Used we shall examine teachers who used (either frequently or infrequently): (a) at most
one representation, (b) exactly two representations, and (c) more than two representations.
Table 1 lists the numbers of each.
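
As a rough illustration of how the two logarithm indices follow from the per-interpretation
categories, the sketch below counts frequent and any use for one teacher. The function name
`logarithm_indices` and the category strings are hypothetical labels chosen for this
example; the paper defines the index levels but gives no code.

```python
def logarithm_indices(categories):
    """Return (Logarithm Frequent, Logarithm Used) levels for one teacher.

    `categories` holds one label per logarithm interpretation, each being
    "not used", "used infrequently", or "used frequently".
    """
    n_frequent = sum(1 for c in categories if c == "used frequently")
    n_used = sum(1 for c in categories if c != "not used")

    log_frequent = "Frequent <= 1" if n_frequent <= 1 else "Frequent > 1"
    if n_used <= 1:
        log_used = "Used <= 1"
    elif n_used == 2:
        log_used = "Used = 2"
    else:
        log_used = "Used > 2"
    return log_frequent, log_used


# A hypothetical teacher: one frequent, one infrequent, one unused interpretation.
teacher = ["used frequently", "used infrequently", "not used"]
print(logarithm_indices(teacher))
```

The same counting applies to the complex number indices, where the three response options
were asked of teachers directly.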

Major Results 

In this paper we examine only the major results of this analysis, that is, results that:
(a) were supported by statistically significant relationships between at least two multiple
representation indices and a variable and (b) had additional support from at least one other
statistically significant relationship between at least one index and another closely related
variable. Other results are given in a technical appendix (Glidden, in press).

Use and Development of Supplemental Materials

As Table 3 shows, there is a strong relationship (p < 0.05) between multiple representation
use and the use of previously self-developed supplemental materials as sources of
information on what to teach. As Table 4 shows, teachers who used multiple representations
were also more likely to develop supplemental materials. This is especially surprising
given how few teachers developed materials at all. Therefore, it appears, teachers who use
multiple representations are more likely to develop and use supplemental materials.

Table 3

Relation Between Complex Indices and Source of Information about Goals and What
Topics to Teach is Materials Previously Prepared by Yourself

                        Never     Occasionally     Frequently
                        Used      Used             Used

Complex Frequent*
  Frequent <= 1          20           40               12
  Frequent > 1            5           22               14
Complex Used**
  Used <= 1              10           19                1
  Used = 2                1           20                5
  Used > 2               14           23               20

Note.  *χ²(2, N = 113) = [illegible], p < 0.05
      **χ²(4, N = 113) = 19.492, p < 0.001

Content Presentation 

As Table 5 illustrates, multiple representation teachers were more likely to use a
minimum competency statement as a source of information on how to present a topic.
Multiple representation teachers also were more likely to use the syllabus (or curriculum
guide) and textbook (see Table 6) as sources of ideas for problems that go beyond drill and
practice. When Tables 3, 5, and 6 are viewed together, it is apparent that multiple
representation teachers are more likely to use various resources (self-developed materials,
minimum competency statement, text, or syllabus) for ideas than are single representation
teachers.

Table 4

Relation Between Complex Used and Development of Supplementary Materials

                   Developed Supplementary Materials
                         No             Yes

Complex Used
  Used <= 1              33               0
  Used = 2               19              10
  Used > 2               41              18

Note. χ²(2, N = 121) = 13.834, p < 0.001
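
As a check on how a statistic of this kind is obtained, the Pearson chi-square for Table 4
can be recomputed from the cell counts alone. The sketch below is illustrative and is not
the authors' procedure (the paper does not say what software was used); the function name
`pearson_chi_square` is hypothetical, and the closed-form p-value used here holds only
when there are exactly 2 degrees of freedom.

```python
import math


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts.

    Assumes every row total and column total is positive, so that no
    expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Table 4 counts: rows are Used <= 1, Used = 2, Used > 2; columns are No, Yes.
table_4 = [[33, 0], [19, 10], [41, 18]]
chi2, dof = pearson_chi_square(table_4)

# For exactly 2 degrees of freedom the upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi-square({dof}, N = 121) = {chi2:.3f}, p = {p_value:.5f}")
```

Run on the Table 4 counts, the statistic agrees with the reported 13.834 to three decimal
places, and the p-value falls just below 0.001.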



Table 5

Relation Between Indices and Source of Information on How to Present a Topic is
Statement of Minimal Competence

                        Never     Occasionally     Frequently
                        Used      Used             Used

Complex Frequent*
  Frequent <= 1          43           20               10
  Frequent > 1           13           16               12
Complex Used**
  Used <= 1              16            9                6
  Used = 2               19            5                2
  Used > 2               21           22               14
Logarithm Frequent***
  Frequent <= 1          44           22               10
  Frequent > 1            8           13               10

Note.   *χ²(2, N = 114) = 8.375, p < 0.05
       **χ²(4, N = 114) = 9.667, p < 0.05
      ***χ²(2, N = 107) = 10.098, p < 0.01

Table 6

Relation Between Complex Indices and Sources of Information on Selecting Problems
(e.g., applications) that Go Beyond Drill and Practice

                        Never     Occasionally     Frequently
                        Used      Used             Used

Syllabus or Curriculum Guide
(Other than Minimum Competency Statement)

Complex Frequent*
  Frequent <= 1          29           34                9
  Frequent > 1            7           23               12
Complex Used**
  Used <= 1              15           11                4
  Used = 2               10           14                3
  Used > 2               11           32               14

Textbook

Logarithm Frequent***
  Frequent <= 1          36           29               10
  Frequent > 1            8           22                2

Note.   *χ²(2, N = 114) = 8.704, p < 0.05
       **χ²(4, N = 114) = 10.087, p < 0.05
      ***χ²(2, N = 107) = 8.148, p < 0.05

A strong relationship was found between the complex multiple representation
indices and number of minutes spent on complex numbers. (See Table 7.) This is especially
noteworthy when we recall that the majority of teachers used one logarithm representation
and the time used for that interpretation varied from zero to 464 minutes. Therefore we
would not expect our logarithm multiple representation indices to capture this relationship.

As already noted, there are major differences between teachers in time allotted. But
as Table 8 illustrates, multiple representation teachers are not only more likely to cover
more theorems, but they also are more likely to give more formal proofs. Therefore, there is
strong evidence that multiple representation teachers spend more time on a topic and cover
the topic more extensively than do nonmultiple representation teachers.

Table 7

Relation Between Complex Indices and Number of Minutes Spent on Complex Numbers

                                        Minutes
                     Mins < 180   180 <= Mins < 360   360 <= Mins

Complex Frequent*
  Frequent <= 1          15              28               24
  Frequent > 1            4              10               28
Complex Used**
  Used <= 1              11               8                5
  Used = 2                3              16               10
  Used > 2                5              14               37

Note.  *χ²(2, N = 109) = 9.994, p < 0.01
      **χ²(4, N = 109) = 27.930, p < 0.001

Table 8

Relationship between Complex Indices and Initial Teacher Presentation

                        Gave       Stated,       Stated,       Not Covered
                        Formal     Informal      No            or Not
                        Proof      Derivation    Derivation    Discussed

Formula $\frac{a+bi}{c+di} = \frac{ac+bd}{c^2+d^2} + \frac{bc-ad}{c^2+d^2}\,i$

Complex Frequent*
  Frequent <= 1          17           21            10            24
  Frequent > 1           21           16             3             4
Complex Used**
  Used <= 1               2            6             4            16
  Used = 2                8            8             5             8
  Used > 2               28           23             4             4

Formula $\{r(\cos\theta + i\sin\theta)\}^n = r^n(\cos n\theta + i\sin n\theta)$

Complex Used***
  Used <= 1               2            6             5            16
  Used = 2               15            5             4             5
  Used > 2               29           11             7            12

Note.   *χ²(3, N = 116) = 13.160, p < 0.005
       **χ²(6, N = 116) = 34.151, p < 0.001
      ***χ²(6, N = 117) = 20.852, p < 0.005

Finally, as Tables 9 and 10 illustrate, there is evidence that teachers who use multiple
representations for complex numbers are also likely to use multiple representations for
logarithms.

Table 9

Relation between Complex Indices and Logarithm Frequent

                            Logarithm Frequent
                       Frequent <= 1    Frequent > 1

Complex Frequent*
  Frequent <= 1             47               13
  Frequent > 1              20               16
Complex Used**
  Used <= 1                 24                3
  Used = 2                  18                8
  Used > 2                  25               18

Note.  *χ²(1, N = 96) = 5.537, p < 0.05
      **χ²(2, N = 96) = 7.444, p < 0.05

Table 10

Relation Between Complex Used and Logarithm Used

                              Logarithm Used
                   Used <= 1     Used = 2     Used > 2

Complex Used
  Used <= 1             9           12            6
  Used = 2             12           11            3
  Used > 2              5           28           10

Note. χ²(4, N = 96) = 11.029, p < 0.05

Teacher Experience and Education. 

As Table 11 indicates, there is a strong direct relationship between experience in
teaching mathematics and use of multiple representations. Additionally, Table 12 illustrates
a statistically significant relationship between Complex Frequent and the number of
semesters of mathematics methods. A similar, but not statistically significant, relationship is
present between age and Complex Frequent and Complex Used.

Table 11

Relation between Indices and Number of Years Experience in Teaching Mathematics

                     Yrs < 5     5 <= Yrs < 13     13 <= Yrs

Complex Frequent*
  Frequent <= 1         22            28               21
  Frequent > 1          11             9               23
Logarithm Used**
  Used <= 1             13             9                9
  Used = 2              14            22               17
  Used > 2               3             4               17

Note.  *χ²(2, N = 114) = 7.063, p < 0.05
      **χ²(4, N = 108) = 15.090, p < 0.05

Table 12

Relation between Complex Frequent and Number of Semesters of Mathematics Methods
and Pedagogy

                     Semesters < 3     3 <= Semesters

Complex Frequent
  Frequent <= 1           37                 34
  Frequent > 1            12                 31

Note. χ²(1, N = 114) = 6.403, p < 0.05

Table 13

Relation between Logarithm Frequent and Head of Department

                        Head of Department
Logarithm Frequent        Yes          No

  Freq <= 1               37           39
  Freq > 1                 8           22

Note. χ²(1, N = 106) = 4.268, p < 0.05
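
For the 2 x 2 tables, the statistic can be recomputed with the familiar shortcut
N(ad - bc)² divided by the product of the four marginal totals. The sketch below assumes
that no continuity correction was applied; the source does not say so, but the reported
values for Tables 12 and 13 are consistent with the uncorrected statistic. The function
name `chi_square_2x2` is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    # Product of the two row totals and the two column totals.
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Table 12: Complex Frequent by semesters of mathematics methods and pedagogy.
table_12 = chi_square_2x2(37, 34, 12, 31)
# Table 13: Logarithm Frequent by head of department (Yes, No).
table_13 = chi_square_2x2(37, 39, 8, 22)
print(round(table_12, 3), round(table_13, 3))
```

Because each table has a single degree of freedom, both statistics can be compared with
the 5% critical value of 3.841.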

However, because of the results shown in Table 13 we cannot assert that there is a
strong relationship between teacher experience/education and multiple representation use.
In Table 13, Logarithm Frequent is inversely related to Head of Department. This could be
attributed to the construction of the Logarithm Frequent index or possibly even to chance.
But the data also show relationships (p < 0.05) between head of department and age,
experience teaching, experience teaching mathematics, and general education courses.
Therefore, until further analysis is performed, we shall say only that there is evidence of a
relationship between teacher experience/education and multiple representation use.

Summary 

External reasons (text and syllabus/external examination) and teacher familiarity
(well known vs. never considered) frequently were cited as reasons for and against teaching
particular subtopics of complex numbers and logarithms. Additionally, content reasons
(related to prior and useful later) frequently were cited as reasons why a subtopic should be
taught. Closely paralleling reasons for subtopic coverage, external reasons and teacher
familiarity frequently were cited as reasons for and against using a particular concept
representation, and content reasons frequently were cited as reasons why a representation
should be used. However, only for concept representation was easy to understand also
frequently cited as a reason for using a particular representation. For both subtopic coverage
and concept representation, easy to teach and enjoyed by students were not often cited
as reasons, either pro or con.

There were significant relationships between the use of multiple representations and
teacher development and use of supplemental materials. There also was a relationship
between multiple representation use and sources of information used to decide what to
teach, how to teach, and what applications to present. Together these relationships suggest
that teachers who use multiple representations use more sources of information
(self-developed materials, minimum competency statement, text, or syllabus) than do
nonmultiple representation teachers.

Teachers who use multiple representations also allot more time for a topic and they
are more likely to cover important formulas and theorems more deeply than nonmultiple
representation teachers. There was some evidence of a relationship between teacher
experience/education and multiple representation use, but further research is necessary before
inferences can be made.

Possible Implications for Mathematics Education

The results discussed above, if supported by further research, suggest several
obvious implications for mathematics education regarding: development and use of
supplemental materials, the relationship between sources of information and multiple
representation use, and time allotted for coverage and depth of coverage. However, there is one
less obvious implication that directly affects teacher education.

Recall Sherlock Holmes's "curious incident" of the dog not barking in "Silver Blaze"
(Doyle, 1893). The fact that the dog did not bark was an important clue because it
suggested that the dog knew the culprit. With respect to classroom process data, this analysis
found no major relationships between multiple representation use and school data. Our
data did not bark. Therefore, it appears that representation use is a local phenomenon, a
function of teacher perception: how familiar the teacher is with a representation,
how easy it is for students to understand, how it relates to prior mathematics, and how
useful it is for future mathematics. This perception may be influenced by the teacher's
educational preparation and experience. This suggests that curriculum designers,
supervisors, and mathematics educators should take special care to provide teachers with
sufficient explanation of and justification for important concepts and their representations.
Teachers make informed judgements regarding representation use (and subtopic coverage) and
mathematics educators should be aware of this.

Appendix I

Short and Long Subtopic Titles

Logarithm Subtopics

Short Title             Long Title

Laws of Logs            Laws of Logarithms
Graphing Logs           Graphing Logarithmic Functions
Natural Logs            Natural Logarithms
Log Applications        Applications of Logarithms

Complex Number Subtopics

Short Title             Long Title

Complex Roots           Complex Roots of Quadratic Equations
Complex on Rect         Graphing Complex Numbers on Rectangular Coordinates
Complex with Polar      Polar Coordinate Representation for Complex Numbers
DeMoivre's Theorem      DeMoivre's Theorem and Roots of Unity

Similar issues in constructing explanatory indices from descriptive data are
discussed in McKnight and Cooney, 1988, p. 4.

There also is a statistically significant relationship between teacher expectation
of student mastery of log ^ i - y and log, x - z iff br - * and Logarithm Frequent.
However, because slightly more than 10% of the teachers did not teach the formula,
the finding was not included. The chi-square statistic was significant at the
5% level for the relationships between teacher expectation of student mastery of
the two formulas in Table 8 and Complex Used, but since several cells had expected
counts less than [illegible], results could not be inferred from the data.

In fact, only two significant relationships were found between multiple
representation use and all the school variables.

This is not to say that there may not be system differences in representation use.
McKnight and Cooney (1988) found no clear, overall patterns of multiple
representation use between systems, and there may be, and probably are, differences
in preferred representations between systems. Further research is required to
determine if a comparable implication can be inferred about other systems.

There also was some evidence of a relationship between multiple representation
use and the teacher's perception of class ability. These results are discussed in the
Technical Appendix.

CREATING GENDER DIFFERENCES:

A COMPARISON OF MALE AND FEMALE MATHEMATICS
PERFORMANCE IN NINETEEN EDUCATIONAL SYSTEMS

Deborah Perkins Jones
David P. Baker

The Catholic University of America

INTRODUCTION

The search for differences between males and females across performance
domains is a research activity undertaken by most disciplines in the behavioral and
social sciences. To a great extent, differences (or similarities) between males and females
are studied almost as by-products of phenomena in the pursuit of other theoretical
interests - such as cognition, physiology or social inequality. To a far lesser extent,
differences between the sexes are studied as part of a theory of gender itself with an
integrated set of hypotheses.

We examine several theoretical accounts of gender differences in one narrow
performance domain. We compare, to the limits of our data, three accounts of gender
influences from three broad segments of the sex difference literature - sociological,
social psychological and biological. The performance domain that we focus on - eighth
grade mathematics achievement - is both narrow in content and short in the duration
of an individual's life. But it is a domain that has demonstrated consequences for a
range of behavior and later life chances.

We search for gender differences in mathematics achievement among 77,000
students within 19 educational systems around the world. We test the degree to which
patterns of gender differences (or similarities) confirm central assumptions underlying
each of the three theoretical perspectives.

THEORETICAL PERSPECTIVES ON GENDER DIFFERENCES 

Since a measure of a subject's sex is easy to incorporate into most studies, a vast
set of empirical findings about male and female differences has been produced. The
same holds true for theoretical consideration of gender. Every conceivable theoretical
perspective on human behavior contains an account of the origins of gender differences
across an array of domains. The resulting literature is immense and unwieldy, without
even the crudest of a central paradigm for conceptual guidance. This makes it difficult
to place new evidence about males and females within a meaningful context. Faced
with this kind of literature the task becomes one of theory reduction.

In generating hypotheses we have limited our consideration to three clusters of
theoretical accounts of gender phenomena. These accounts represent central pools from
which a large number of other theories flow. Also, each of these perspectives has some
history of results in examining mathematical ability between the sexes. In our models of
each perspective we do not claim to exhaust all of the numerous twists and turns of each
theory, but rather we bring our data to bear on the central assumptions, the necessary
conditions, of each of the general perspectives.



The Sociology of Gender Differences 

Most sociological accounts of gender rest on the assumption that gender roles are
born out of the institutions within a society. At the center of this idea is the notion that
institutions define gender roles and that these definitions become forged into a diffuse
"gender belief system" which shapes the day-to-day behaviors and attitudes of men and
women, and girls and boys (Hess & Ferree, 1987).

By a sociological account then, the genesis of gender roles is the institutional
rules of being a male or being a female. Other processes, more social psychological or
even physical in nature, may transmit these rules to individuals, but at the heart of this
perspective is the imagery of institutions forming rules about gender which in turn
form the status of female or male within a society. A wide variety of institutions have
gender-specific rules, such as rules of courtship and marriage, family organization, access
to political power, and access and control over economic resources.

Related to this notion is that as institutional rules vary across societies, gender
status varies across societies. Gender is considered to be actively and socially
constructed; it is not an immutable quality. A central assumption of this sociological
image is that differences in the relative status of the two sexes will correspond with the
relative differences in performances. In societies in which there is a large difference
between the status of men and women, there will also be large performance differences
between men and women. In societies in which the relative status between the genders
is small, performance differences between the genders should also be small. The
reasoning behind this assumption is that gender will play less of a role in
determining the conditions of performance of individuals in societies where gender is
used less as a stratifying quality.

This is a main hypothesis of sociological explanations of gender differences in
mathematical performance, but it has rarely been tested. A test of this hypothesis
requires the kind of data we have - namely, for a sample of societies a measure of sex
differences in mathematical performance and measures of status differences between
males and females. We have used a large cross-national data set on mathematical
abilities of 8th graders as our indicator of gender difference in performance across
societies. We have added to that a variety of measures of the relative status of men and
women across a range of institutions. We have not attempted to form one global
measure of gender status, but rather have selected indicators from several institutional
dimensions. This allows us to assess the relative ability of various institutions to shape
gender statuses which might create gender differences in performance.

Although we include indicators of general social status, we focus on economic
indicators of gender status since differences between men's and women's access to
financial resources and occupations seem to be a key correlate of a general gender status
(Blau & Ferber, 1986; Chafetz & Dworkin, 1986). Also technical training and preparation
for occupational positions are linked through attitudes towards formal schooling.
Given the perception of mathematics training as an occupational skill, the basic
sociological argument suggests that within a society with weak gender barriers to
economic participation, there should be smaller gender differences in performance.

There is a related argument from the sociological perspective that we can
examine. A number of global phenomena have resulted in limiting the degree to which
social systems are structured (and stratified) by traditional attributes such as clan, family,
ethnicity, and caste. The same case can be made for gender stratification as well.

The full host of influences on this process are too numerous to describe here, but
the core of the argument usually centers on the phenomena of nation-state building and
the process of creating citizens through formal schooling (Meyer & Hannan, 1979;
Ramirez & Boli-Bennett, 1988). The argument goes that modern nation-states work to
decrease traditional ties and increase citizen allegiance and participation. State
sponsored institutions shoulder much of this task and chief among these institutions is
formal schooling. It is suggested that schools through their state-derived charter and
structure mitigate against traditional forms of stratification. As a student body and a
future citizenry, children are less likely to be stratified in school by such qualities as their
sex. The tone of this argument is essentially historical. As the Western model of states
and schools spread, so did a decrease in the legitimate use of traditional mechanisms of
stratification. The official implications of this trend can be seen in such governmental
actions as the U.S. Title IX prohibiting gender discrimination in school activities.

If this argument is true, we should find that gender differences in school
performance decrease over time.1 We can compare earlier national results of boys' and
girls' performance on mathematics tests with the data we use to assess whether, as the
conditions of gender stratification in a society decrease, so do gender differences in
academic performance.

The Social Psychology of Gender Differences

There are a variety of social psychological accounts of gender differences. At the
heart of most of these is the notion that face-to-face interactions in various social
organizations influence the sexes in different ways thus yielding different performances.
This basic scenario is very salient in the literature on gender and schools, in which a
number of school factors are suspected of producing different experiences for males and
females. These factors range from the imagery of a "hidden curriculum," which is
thought to contain gender stratifying qualities, to more overt discrimination of access to
educational opportunities (Becker, 1981; Brophy & Good, 1974; Fennema, et al., 1980;
Leinhardt, Seewald & Engel, 1979; Morse & Handley, 1985).

The basic argument in all of these perspectives is that males are given advantages
over females for the mastery of mathematics in school, and that these advantages are
social psychological in nature, namely effects of face-to-face interactions (Aiken, 1976;
Burton, 1986; Walden & Walkerdine, 1986).

There are two such face-to-face schooling processes - within classroom
interaction and family effects - which are often cited as causing gender stratification of
performance. Research on the former considers how teachers might teach differently to
male students and female students. And research on the latter considers how parents
might influence their daughters and sons differently to achieve in school.

1

There are other scenarios that generate essentially the same hypothesis
about gender effects as does the nation-state and citizenship formation perspective. These include
[illegible] arguments, arguments about the effects of conflict over traditional
stratification and the expansion of industrial economies that break down traditional structures
that incorporate gender heavily into the scheme of stratification. Our analysis tests only the basic
hypothesis about the decline in gender effects on performance, not hypotheses about each of the
numerous causal mechanisms that could play a part in this process.

There is some cross-national evidence to suggest that males and females are
universally treated differently in the schooling process (Finn, 1980). But also there is
evidence to suggest the opposite, that formal education has become a force of gender
egalitarianism, a place where females and males are treated similarly.

Whether or not males and females are taught differently is a large question that
cannot be completely answered with just one study. The data we use certainly do not
contain measures of all possible gender discrimination that could occur during teaching
in the classroom. They do, however, contain a measure of perhaps the most central of
schooling processes determining performance, namely access to curriculum, or a
student's "opportunity to learn" (OTL). We can determine if there are systematic gender
differences in the opportunity to learn mathematics in these 19 educational systems. Do
males gain an advantage in mathematics by being in classrooms where more and more
advanced mathematics is taught? Or conversely, are females at a disadvantage because
they are funnelled into classes where less and less advanced mathematics is taught?

Family influences as a possible explanation of gender differences in mathematics
have been considered from a variety of perspectives, such as early socialization, forming
performance expectations and standards, modeling of behavior conducive to solving
mathematical problems and social reinforcements (e.g., Baker & Entwistle, 1987; Fox,
Tobin & Brody, 1979). As is the case in all social psychological accounts of gender
differences, the family is suspected of treating sons and daughters differently in terms of
instilling the necessary skills to do mathematics (Eccles & Jacobs, 1986).

Since we do not have either direct family observation of parent-child interaction
or parents' perceptions of their support, we do not focus on family effects in considering
these social psychological arguments. We can, however, include some investigation of
the students' perceptions of their parents' encouragement to do well in mathematics
and the students' attitudes about gender and mathematical training. We can assess the
size of gender differences in these perceptions and attitudes and the relationship
between these differences and performance differences across educational systems.

We test two central hypotheses of a social psychological perspective. First we can
examine one "hidden curriculum" hypothesis, namely that boys receive more access to
mathematical instruction than do girls and that this is a uniform pattern across
educational systems. Second we can examine a more general socialization hypothesis,
namely that parents encourage their sons' mathematical achievement more than their
daughters' mathematical achievement and that this is a uniform pattern across
educational systems.



The Biology of Gender Differences 

Biological explanations of gender differences in mathematics achievement are
ancient and varied, with the earliest speculation about cognition and gender differences
dating back to Aristotle (Sherman, 1978). Current biological explanations reflect current
core paradigms of biological thinking about an array of human performance, with
accounts based on hormonal (Broverman, Klaiber, Kobayashi & Vogel, 1968), genetic
(Bock & Kolakowski, 1973; Stafford, 1961) and neural structural effects (Levy, 1976;
Waber, 1979).

Most biological theories rely on the relationship between spatial and
mathematical ability. These theories argue that some biological characteristic
(hormones, genes or brain structure) produces different degrees of spatial perception
powers and this causes performance differences in solving mathematical problems. For
the most part, these theories and the research that they spawn are relatively inductive.2
They first assume that there are clear and consistent gender differences in solving
mathematics problems and that the problem is to identify which biological factors, that
are known to be distributed by gender, might account for the observed pattern of
performance. Seldom, if ever, are the operative factors actually measured and tested
against performance. This is partially because of the difficulty in measuring these
factors, but equally it is because of the confidence in the inductive process behind much
of this perspective. Consider, for example, Benbow and Stanley's (1980) highly
publicized paper in Science. They claim that since American male junior-high students
outperform female junior-high students on a difficult mathematical test, males must
have superior mathematical ability, "which may in turn be related to greater male ability
in spatial skills" (p. 1264). They make this claim with only the scantiest of evidence
about the lack of other non-biological effects at work within their data and without any
direct measure of spatial skills. Their original claim may be correct, but they have not
attempted to consider the extent to which biological and non-biological factors may
shape the gender differences they observed.

2

See Star (1978) for a similar critique of research on gender differences and brain hemisphere
asymmetry.

Our data do not contain measures of operative factors from any of the biological 
theories of gender and mathematical performance. But with this data set we can 
extensively examine the core assumption behind the inductive chain of reasoning in 
these theories, namely: are there consistent and large gender differences in 
mathematical achievement across a sizable number of students from different 
educational systems in different societies? 

To the degree that the answer to this question is no, a general biological 
perspective on gender differences faces a difficult obstacle. A lack of consistent 
differences is not in and of itself a complete rejection of biological effects, since there are 
any number of genotypic and phenotypic analogies to suggest that biological influences 
can be masked by environmental ones. But at the very least, a lack of consistent 
differences would question the inductive reasoning that seems to buttress so much of 
the biological research about these phenomena. 

Additionally, a mixed pattern of gender differences would indicate the size of 
non-biological influences in these distributions. Short of offering some theory of 
societal influences on biological factors, an inconsistent pattern of effects suggests a 
variety of social influences. 

Gender Difference as the Dependent Variable 

Although most of the sex difference literature discusses phenomena in terms of 
individual differences between males and females, these studies are really investigating 
qualities of distributions. Except for a few gross anatomical characteristics, there is no 
evidence to suggest that all females differ from all males on any dimension. What we 
actually study is the distribution of one sex compared to the distribution of the other. 
We can examine how close the means are, or how spread out the distributions are 
relative to one another, and so forth. Thus we can make probability statements about 
gender effects, such as "one's sex is likely to influence what one does or thinks or believes." 
These probabilities, however, are used merely to approximate individual qualities from 
aggregate qualities. The real currency of gender effects is differences or similarities 
among distributions of male and female performances. Therefore, we use comparisons 
of the male and female distributions of mathematical achievement from each system as 
our dependent variable. 



Data and Measures 

Data 

The data on mathematical achievement come from the Second International 
Mathematics Study (SIMS) sponsored by the International Association for the 
Evaluation of Educational Achievement (IEA). SIMS, collected in 1981, is a 
comprehensive assessment of the school mathematics of over 77,000 students in the grade 
equivalent to American 8th grade. Originally 20 national units participated in the study: 
French Belgium (BFR), Flemish Belgium (BFL), British Columbia (BRC), 
England (ENG), Finland (FNL), France (FRA), Hong Kong (HKG), Hungary (HUN), 
Israel (ISR), Japan (JAP), Luxembourg (LUX), The Netherlands (NTH), New Zealand 
(NZL), Nigeria (NGR), Ontario (ONT), Scotland (SCT), Swaziland (SWZ), Sweden 
(SWD), Thailand (THA), and the United States (USA).3 

The units do not represent a random sample of all nations in the world. Rather, 
they chose to participate in the study, and each had control over its sampling and 
administration of the study instruments. The sample, however, does represent a 
reasonable mixture of the world's nations, including developed and less-developed 
nations and nations from most geographical regions of the world.4 The sample of 
nations also represents a diverse set of administrative educational practices (Stevenson 
& Baker, 1989). 

In each unit, a stratified, random sample of classrooms was drawn to the 
specifications of the guidelines developed by an international committee (Garden, 1987). 
The goal was to generate a representative sample of 13-year-old students and schools in 
each educational system. A common mathematics test, minimally adapted for each 
country, was administered to these sampled intact classrooms at the end of the school 
year. The test was designed to tap a range of mathematics skills, including four specific 
skill areas and five substantive areas. The total test contained 190 items made up of one 
40-item core test and five rotated forms containing the remainder of the items. Both the 
core test and the forms have a similar mixture of items in terms of skill and substantive 
areas. Each student took the core test and one of the rotated forms. Since all students in 
the study received the identical 40-item core test, we use just these items in our analysis. 
Additionally, each student was asked to complete a questionnaire inquiring about their 
gender, their attitudes towards mathematics and their perceptions of their parents' 
involvement in their preparation for mathematics. 

3 We can analyze only 19 of these systems because the Flemish Belgium sample did not 
contain a way to match the student's gender to their test performance. For the two provincial 
systems in Canada and the United Kingdom, we attempted to use province-level indicators of 
female status where possible. Also, the Japanese sample was of 7th grade students and the 
Nigerian sample was of 9th grade students, since in both systems the national committees deemed 
that the test tapped the mathematics curriculum at these grade levels. 

4 See Jones (1989) for a full description of the sample and the SIMS study. 

Teachers of the sampled classrooms were also given a questionnaire about what 
and how they taught mathematics to the target classroom. For each item on the test, 
teachers were asked to report whether they had taught the information needed to answer the 
test item. This is the so-called Opportunity to Learn (OTL) measure. See Appendix A for 
educational system, student and classroom sample sizes. 

For each of the national units in the sample we collected a number of measures 
of gender status, economic development and the size of the school system. These came 
from published sources including United Nations Demographic Yearbooks, UNESCO 
Statistical Yearbooks and Population Reference Bureau publications (Kent, Haub & 
Osaki, 1985; Sivard, 1985). 

Measures 

The gender difference on the core mathematics test for each educational system 
was calculated as the male mean score minus the female mean score. The individual 
scores from which the means were constructed were calculated as follows. Each core test 
item was multiple choice with five response options. A core test score was computed 
using an estimated-number-known equation (Gulliksen, 1950): 

CoreScore = ΣR - (ΣW / 4) 

where: ΣR = number of items correct 
       ΣW = number of items incorrect 

This scoring corrects for the effects of guessing: with five options, a student who 
guesses at random expects one correct answer for every four incorrect ones, so the 
penalty offsets the expected gain from chance. 
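The scoring rule above can be sketched in a few lines of Python. This is an illustration only, not the authors' scoring program; the function name and the `num_options` parameter are assumptions introduced here, with five options matching the core test items described in the text.

```python
def core_score(num_right: int, num_wrong: int, num_options: int = 5) -> float:
    """Guessing-corrected score: items right minus items wrong / (options - 1).

    Omitted items count as neither right nor wrong.
    """
    if num_options < 2:
        raise ValueError("an item needs at least two response options")
    return num_right - num_wrong / (num_options - 1)


# A student with 24 right, 12 wrong and 4 omitted items on the 40-item core test.
example = core_score(24, 12)

# Pure random guessing on all 40 items is expected to give 8 right and 32 wrong,
# which the correction maps to a score of zero.
chance_level = core_score(8, 32)
```

Under this rule a perfect paper still scores 40, while a paper answered entirely by chance is expected to score 0.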
The OTL for each teacher was calculated for each item and was merged onto the 
student records, which enabled us to examine student-level gender differences in access 
to mathematics instruction. Some OTL analyses were done at the classroom level using 
just the teacher file, which included the gender breakdown of the classroom. 

We use four indicators of women's status in non-economic institutions in each 
country. These include: 

Fertility Rate - measured as the average number of children a 
woman would have during her lifetime at current birth rates. 

Percent Female Use of Contraceptives - measured as women in 
marital or consensual union, aged 15-49, using modern 
methods, defined as the pill, IUD, sterilization, condom, 
diaphragm, foam and other barrier or chemical methods. 

Percent of Females aged 15-19 Married. 

Number of Females in the national Legislation - Women both 
elected and appointed to legislative bodies. 

We use six indicators of female status in the labor force for each country: 

Percent Female in the Labor Force - as a percent of total labor force. 

Percent Female in the Industrial Sector of the Labor Force. 

Percent Female in the Service Sector of the Labor Force. 

Percent Female in the Agricultural Sector of the Labor Force. 

Gender Occupational Segregation Index - the degree to which 
females and males are concentrated in separate occupations. 

Ratio of Female to Male Earnings - averaged over all jobs. 

Results 

Table 1 presents the sex differences for each educational system on the 40-item 
core test. In the third column are the differences themselves (the male mean minus the 
female mean). Standard biological accounts of gender differences and numerous 
empirical studies suggest that males will outperform females on mathematical tests 
(Aiken, 1976; Backman, 1972; Benbow & Stanley, 1980; Maccoby & Jacklin, 1974; Mullis, 
1975). This is not the case in these data. Instead, the differences fall into three distinct 
categories. In the first category are seven systems in which males do better than females. 
In the second category are eight systems in which there is no significant difference 
between the sexes. And in the third category are four systems in which females do better 
than males. There is also no evidence to suggest that the absolute size of a gender 
difference favors either sex. The mean absolute difference among the systems in the 
first group is 1.50 and in the third group it is an almost identical 1.51. The very small 
country-level mean difference of .30 reflects this mixture of gender differences across the 
systems. Also, a country-level mean difference weighted by the sample sizes shows less 
than a one-half of one item advantage for males (X = .49). Finally, it appears that 
mathematics performance is stratified less by gender than by educational system. Here, 
as in earlier comparative mathematics studies (Husén, 1967), between-system differences 
are substantially larger than the gender differences within any one system. These 
analyses offer little support for theories of gender that assume a consistent and uniform 
pattern of performance differences between the sexes. 
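The country-level summary figures quoted above can be checked against the Difference column of Table 1. The sketch below is illustrative rather than the authors' procedure: the dictionary simply transcribes the reported male-minus-female differences, and `weighted_mean` is a hypothetical helper. The sample sizes needed for the weighted figure are in Appendix A and are not reproduced here, so the helper is shown without real weights.

```python
from statistics import mean, stdev

# Male-minus-female core test differences as reported in Table 1.
DIFFERENCES = {
    "France": 2.84, "Israel": 1.05, "Luxembourg": 1.60, "The Netherlands": 1.77,
    "New Zealand": 1.09, "Ontario": 0.78, "Swaziland": 1.40,
    "British Columbia": 0.28, "England/Wales": 0.46, "Hong Kong": 0.50,
    "Japan": 0.04, "Nigeria": 0.45, "Scotland": 0.15, "Sweden": -0.48,
    "USA": -0.14,
    "Belgium-French": -1.10, "Finland": -1.63, "Hungary": -1.26,
    "Thailand": -2.07,
}


def weighted_mean(values, weights):
    """Mean of values weighted by, for example, each system's sample size."""
    values, weights = list(values), list(weights)
    if len(values) != len(weights) or sum(weights) == 0:
        raise ValueError("need one non-trivial weight per value")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


values = list(DIFFERENCES.values())
unweighted = mean(values)   # country-level mean difference
spread = stdev(values)      # sample standard deviation across the 19 systems

# Group means of the absolute differences for the male- and female-advantage systems.
male_advantage = mean(abs(v) for v in values[:7])
female_advantage = mean(abs(v) for v in values[15:])
```

Rounded to two decimals, these reproduce the .30 mean and 1.24 standard deviation in the last rows of Table 1, and group means of about 1.50 and 1.51.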

Some of the more recent biologically grounded investigations of gender 
differences, however, have suggested that uniform differences will be most prevalent 
among the most difficult of mathematical areas. This is the male advantage hypothesis 
on so-called "higher order thinking" (HOT) involving spatial relationships, encouraged 
by the results that Benbow and Stanley (1980; 1983) report. 

Although the core test was designed to tap a range of mathematics skills, we can 
examine the most difficult items to assess the HOT hypothesis comparatively. Within 
each system we determined the ten core test items which were most frequently 
answered incorrectly and then calculated the sex difference on these items for each 
system.5 These differences are presented in the fourth column of Table 1. 
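A minimal sketch of this within-system procedure follows. The record layout (a `gender` code and a list of 0/1 item results per student) and both function names are assumptions made for illustration, and the subscore is a simple count of correct answers; the paper does not say whether the guessing correction was applied to the ten-item subscore.

```python
from statistics import mean


def hardest_items(students, k=10):
    """Indices of the k items most often answered incorrectly in one system."""
    n_items = len(students[0]["correct"])
    error_rate = [
        sum(1 - s["correct"][i] for s in students) / len(students)
        for i in range(n_items)
    ]
    # Sort item indices from highest to lowest error rate and keep the first k.
    return sorted(range(n_items), key=lambda i: error_rate[i], reverse=True)[:k]


def gap_on_hardest(students, k=10):
    """Male mean minus female mean on the system's k hardest items."""
    items = hardest_items(students, k)

    def subscore(student):
        return sum(student["correct"][i] for i in items)

    males = [subscore(s) for s in students if s["gender"] == "M"]
    females = [subscore(s) for s in students if s["gender"] == "F"]
    return mean(males) - mean(females)


# Toy system with four students and three items (hypothetical data).
toy = [
    {"gender": "M", "correct": [1, 1, 1]},
    {"gender": "M", "correct": [1, 0, 0]},
    {"gender": "F", "correct": [1, 0, 1]},
    {"gender": "F", "correct": [1, 0, 0]},
]
toy_hardest = hardest_items(toy, k=2)
toy_gap = gap_on_hardest(toy, k=2)
```

Because the item set is chosen separately inside each system, the ten items compared can differ from one country to the next, which is the design choice footnote 5 describes.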

In 12 out of the 19 systems the average male performed better than the average 
female on the 10 most difficult items on the core test. Also, in no system did females 
significantly outperform males, as was the case with the full core test. This pattern 
lends some credibility to the hypothesis that males have an advantage in performing 
difficult mathematical problems, although the country-level unweighted mean 
difference (.31) and the weighted mean difference (.38) are both relatively small. There 
were, however, seven systems in which this pattern did not hold and among these are 



5 Because performance and the teaching of a mathematics curriculum varied so much between 
systems, we calculated the ten most difficult items within each system instead of across all 
systems. 
Table 1 

IEA Second International Mathematics Study 8th Grade Core Test Score Means by 
Gender. 

                                                                 Difference on 10 
                          Male       Female      Difference      most difficult 
Country                   mean       mean        (MX - FX)       items (MX - FX) 

I. Xm > Xf 
   France                 17.02      14.18         2.84*           1.00* 
   Israel                 18.79      17.74         1.05*            .36* 
   Luxembourg             13.34      11.74         1.60*            .58* 
   The Netherlands        22.00      20.23         1.77*            .66* 
   New Zealand            14.60      13.51         1.09**           .69* 
   Ontario, Canada        17.72      16.94          .78**           .39* 
   Swaziland               9.29       7.89         1.40**           2S 

II. Xm = Xf 
   British Columbia       19.55      19.27          .28             .41* 
   England/Wales UK       15.38      14.92          .46             .30* 
   Hong Kong              16.59      16.09          .50             .30* 
   Japan                  23.84      23.80          .04             .3r 
   Nigeria                 9.50       9.05          .45             .04 
   Scotland UK            16.83      16.68          .15             .24 
   Sweden                 10.70      11.18         -.48            -.07 
   USA                    14.98      15.12         -.14             .27* 

III. Xm < Xf 
   Belgium-French         19.44      20.54        -1.10**           a3 
   Finland                13.24      14.87        -1.63**           .05 
   Hungary                22.36      23.62        -1.26*           -.01 
   Thailand               12.09      14.16        -2.07**          -.14 

Country Mean (N=19)       16.17      15.87          .30             .31 
Standard Deviation         4.24       4.37         1.24             .29 

Scores on 40-Item Core Test were calculated as R - (W/4), where R is number of items 
correct and W is number of items incorrect. 

** F ratio has p ≤ .01 
the four systems in which females did better than males on the full test. So there is 
countering evidence to suggest that even among the most difficult items there is not a 
uniform pattern of gender difference. 

Before continuing the analysis we stop here to consider one major source of 
possible bias in these estimates. The SIMS sampling focused on the grade in which most 
13-year-old children were enrolled (i.e., the 8th grade). But not all of these educational 
systems are like the United States, in which nearly all 13-year-old males and females are 
in school in the same grade, thus yielding a nationally representative sample 
comparable for both sexes. If schooling in a particular system is selective for 13-year-old 
students and if this selection is somehow related to the gender of the student, then this 
could cause a biased comparison between male and female students from one grade 
level. While it is difficult to obtain precise estimates for each country of the percentages 
of 13-year-old children, by gender, who are enrolled in the same grade level, we can 
make some rough estimates from which to judge any bias. 

Fortunately, most systems in the SIMS sample appear to be like the U.S. In only a 
few systems is there a possible comparison bias created by the structure and selectivity of 
schooling. These few cases are interesting to consider. Take, for instance, France, in 
which there are substantially fewer males than females in the 8th grade school 
population (and hence in the SIMS sample; see column 3 of Appendix A). This is due 
to a number of factors, chief among them that over one half of French students repeat 
a year of school, and more boys than girls do this. In France repeating a year is often 
used as a proactive device to add an additional year of preparation for entrance to more 
difficult and prestigious technical secondary school streams (such as the "C- 
curriculum"), and boys apparently use this strategy more than do girls. Thus the large male 
advantage in mathematics knowledge in the French sample may be upwardly biased, as 
we are comparing a smaller, slightly older and perhaps better prepared male population 
against a more general female population. 

The reverse may be true in Nigeria, in which there is low primary school 
enrollment in general (51% of an age-cohort in 1975) and male students outnumber 
female students by 2 to 1. The fact that we find no difference between the male and 
female means in mathematical ability probably underestimates male performance, since 
we are comparing a broader population of Nigerian males against what is most likely a 
more selective group of Nigerian females. 

By this type of reasoning, we estimate that some organizational bias may be 
involved in only five cases. The idiosyncratic structure of each case is too lengthy to 
describe here. We estimate, however, that among countries with a male advantage in 
mathematical achievement, certainly France and, to a lesser extent, Luxembourg and 
The Netherlands are upwardly biased. Among countries with parity between the sexes 
in achievement, Nigeria, as described above, may underestimate male performance. 
And among countries with a female advantage, Thailand is probably upwardly biased, 
but only to a small degree. 

We next examine males' and females' access to mathematical instruction in the 
8th grade. Table 2 presents the gender means and differences for OTL for the core test 
items in 14 of the systems.6 The third column presents the gender differences in OTL. 
A central assumption of most social psychological accounts of gender differences in 
school settings is that through various mechanisms males have more access to 
mathematical instruction than do females, and that this difference in access causes gender 
differences in performance. This assumption, however, does not receive much support 
from the gender differences in 8th grade OTL in the SIMS data. In fact, in one-half of the 
educational systems girls receive more mathematics instruction than do boys. And 
there is no difference in the full sample between male and female OTL means. There is 
also no correlation (r = .09) among the systems between gender differences in core test 
performance and gender differences in OTL. For example, among systems in which 
males perform better than females there is a mixed pattern of gender differences in OTL. 
Furthermore, in analysis not presented here, there is no evidence to suggest that 8th 
grade boys have more access to different or more difficult substance areas (arithmetic, 
geometry, algebra, measurement and probability) than do girls (Jones, 1989). 

Although there appears not to be a male advantage in terms of access to 
instruction, there may be other, subtle ways in which one gender is given an advantage 
over the other. The so-called "hidden curriculum" perspective suggests that 
stratification within schools occurs through a variety of face-to-face mechanisms, some 
very subtle and others more manifest. Since the students were sampled by intact 
classrooms, we can examine one such "hidden curriculum" hypothesis, namely that 
teachers alter the amount of mathematics they teach as a function of the gender 
composition of the class. This hypothesis flows from a number of "hidden curriculum" 
arguments which suggest that teachers, willingly or otherwise, take part in the social 
stratification of the schooling process. 

6 Five systems did not collect OTL, but fortunately these systems are evenly distributed 
across the categories of gender differences, with one (Israel) from the male advantage category, two 
(England/Wales UK and Hong Kong) from the no difference category and one (Belgium-French) 
from the female advantage category. 

The last column in Table 2 reports the unstandardized regression coefficient from 
regressing OTL on the percent female in the classroom. A negative coefficient indicates 
that teachers within a particular system decrease the amount of mathematics 
instruction as the number of female students increases. This is the case in only one 
system (The Netherlands). In the majority of systems the number of girls in a classroom 
has no effect on the amount of mathematics taught, and in three systems (the USA 
included) teachers teach more mathematics when there are more female students.7 
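As a rough illustration of the coefficient reported in the last column of Table 2, the sketch below fits a simple classroom-level regression of OTL on percent female and returns the unstandardized slope with its standard error. The function name and the five classrooms are hypothetical; the paper does not describe the software or any further covariates used, so this is only the textbook bivariate case.

```python
from math import sqrt


def ols_slope_with_se(x, y):
    """Return (slope, standard error) for the simple regression of y on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (slope and intercept).
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_error = sqrt(residual_ss / (n - 2) / sxx)
    return slope, std_error


# Hypothetical classrooms: percent female and percent of core items taught.
pct_female = [30, 40, 50, 60, 70]
otl = [60, 63, 63, 67, 68]
slope, se = ols_slope_with_se(pct_female, otl)
print(f"b = {slope:.2f} (SE {se:.2f})")
```

A positive slope, as in this toy data, corresponds to the pattern reported for the USA, British Columbia and Finland, where OTL rises with the share of girls in the class.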

Classrooms in these 19 educational systems appear to be generally egalitarian in 
terms of males' and females' access to 8th grade mathematics instruction. There is little 
support for the notion that schools manifestly limit classroom opportunities in 8th 
grade on the basis of the gender of the student. Finally, what variation there is between 
gender and OTL is not related to the mixed pattern of gender differences in performance 
that we report in Table 1. 

We next turn to several sociological explanations for the pattern of gender 
difference reported in Tables 1 and 2. A central notion of sociological perspectives on 
gender is that the relative status of men and women will influence sex differences in 
actions and attitudes. To the degree that a society's institutions create status differences 
between men and women, gender will be a stratifying characteristic. If this explanation 



7 Carrying the "hidden curriculum" notion further, one could argue that because female students 
tend to be better behaved in class (Entwisle & Hayduk, 1979), teachers with more female students 
can teach more of anything, mathematics included, so that the generally positive coefficients here 
do indeed represent a type of "hidden effect" of gender. To test this we added to the equations in 
Table 2 the teacher's estimate of the time spent on keeping order in the class. The effects of percent 
female were not diminished by adding this variable; teachers do not alter the amount of 
mathematics taught because more girls in class means better behaved students. If these positive 
coefficients represent a gender effect here, its underlying cause is not clear to us. We have also not 
separated out single-sex classrooms from this analysis, which could produce different gender effects 
from mixed-sex classrooms (Riordan, 1989). 
Table 2 

Access to Mathematics Instruction (OTL) in 8th Grade by Gender for Skills needed for 
Core Test Items. 

                                                               Unstandardized coefficient 
                          Male        Female      Difference   from OTL regressed on 
                          mean %      mean %      OTL Core     % female in class 
                          Core OTL    Core OTL    (M-F)        (standard error) 

I.   The Netherlands       39.72       35.68        4.04*        -.05*   (.02) 
     France                85.05       85.42        -.37*         .03    (.02) 
     Ontario, Canada       80.79       80.76         .03         -.04    (.11) 
     Luxembourg            59.24       61.27       -2.03**        .03    (.05) 
     Swaziland             65.60       67.13       -1.53          .26    (.20) 
     New Zealand           67.95       68.99       -1.04          .04    (.04) 

II.  USA                   77.50       78.82       -1.32**        .26**  (.08) 
     British Columbia      25.55       27.07       -1.52**        .27*   (.13) 
     Japan                 81.43       81.43        0.00          .02    (.09) 
     Nigeria               73.19       76.43       -3.25**        .07    (.09) 
     Sweden                52.55       53.15        -.60          .09    (.06) 

III. Finland               63.15       64.07        -.92**        .14*   (.06) 
     Hungary               48.61       49.34        -.73          .04    (.?2) 
     Thailand              84.84       84.46         .38         -.01    (.04) 

Country Mean (N=14)        64.6543     65.2871      -.6329 
Country Standard 
Deviation                  17.9766     18.3003      1.6384 

*  Calculated F ratio has p < .05. 

** Calculated F ratio has p < .01. 
of gender phenomena is correct, we should find that the relative status of men and 
women will affect even a specific performance domain such as mathematics ability. To 
test this we examine the relationship between the relative status of females and males in 
a society and the size of the gender difference in 8th grade mathematics in the 19 
educational systems in the SIMS data. 

We begin with four indicators of the general social status of females in a society. The 
correlations between these and the size of a society's gender difference in mathematics 
performance are presented in the first column of Panel A in Table 3. Contrary to the 
broadest interpretation of a sociological perspective on gender, the general status of females 
is not related to the size of gender differences in performance of mathematics. Control 
over reproduction and marriage clearly are unrelated to performance differences. 
Political incorporation does show a modest association in the predicted direction, but the 
coefficient is not statistically significant. 

In the first column of Panel B in this table we examine six variables that reflect 
various aspects of the integration of women into the institution of work. These 
indicators of the occupational status of females are related to the size of the gender 
difference in core test performance among the society's 8th grade students. In systems in 
which higher percentages of women work in the formal workforce, girls are more likely 
to perform as well as or better than boys in mathematics. There appears to be a sector effect 
as well, with female participation in lower status agricultural work being less related to 
gender performance than female participation in higher status industrial work. 
Although the correlations for an index of occupational segregation and the ratio of 
female wages to male wages are not significant, each is in the predicted direction. All of 
these associations remain stable even after controlling for the general economic 
development of the country (GNP) (analysis not reported here; see Jones, 1989), and 
many are statistically significant despite the small sample. 
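The country-level associations discussed here are simple correlations across systems. A minimal sketch of that computation is given below; `pearson_r` is a generic helper written for illustration, and the six data points are hypothetical values, not figures from the SIMS file or the published sources.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sqrt(sxx * syy)


# Hypothetical systems: percent female in the labor force and the
# male-minus-female core test difference.
pct_female_labor = [28, 33, 36, 40, 45, 47]
core_gap = [1.8, 1.1, 0.5, 0.3, -0.9, -1.4]
r_example = pearson_r(pct_female_labor, core_gap)
```

A negative coefficient, as in this toy data, is the direction the sociological argument predicts: the larger the female share of the workforce, the smaller the male advantage.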
Table 3 

Correlations Between Country-level Indicators of Female Status and Gender 
Differences in 8th Grade on the Core Test, OTL and Attitudes. 

                                       Gender Difference (M-F) on: 

                               40-Item                 Parental          Agree that Boys 
                               Core        OTL         Encouragement     Need more Math 
                                           (N=15)      (N=18)            (N=19) 

Panel A: 
Women's Social Position 

Fertility Rate                   .08        -.42         -.31              -.12 
% Female Use Contraceptives      .02         .44          .09              -.84* 
% Female 15-19 Married          -.14        -.49         -.19               .89** 
# Female in National            -.32         .06          .30               .25 
  Legislation                   (N=16) 

Panel B: 
Labor Force Participation 

% Female Labor Force            -.55*       -.27         -.61**            -.20 
% Female Industrial             -.59*        .01         -.42*              .24 
% Female Service                -.40*       -.12          .12               .35 
% Female Agricultural           -.24        -.28         -.42*             -.21 
Gender Occ Segregation           .33         .06          .30               .68* 
                                (N=8)                                      (N=8) 
Female:Male Earnings            -.24         .18         -.15              -.47 
                                (N=11) 

*  p < .05. 
** p < .01. 
In the second column of Table 3 are the same correlations for gender differences 
in OTL. As we have already shown, there is considerably less range in gender 
differences in OTL than there is in performance, and so we would not expect adult 
gender status to vary greatly with access to mathematics instruction. Unlike in the case 
of performance, gender differences in OTL are not associated with indicators of females' 
participation in the labor force. The indicators of female social status are also not related 
to gender differences in OTL.8 

On a five-point Likert scale, with five as agreeing the most, there is considerable 
variation in the size of the gender difference on parental encouragement in 
mathematics. The overall mean is 1.4 (SD 5.9), with about a third of the countries 
having a female advantage, a third with parity between the sexes and a third with a male 
advantage. These system-level differences in perceptions of parental encouragement are 
associated with the performance gender differences (r = .47, p ≤ .02), so that countries that 
yield gender differences in performance also yield gender differences in perceptions of 
support by parents. These differences in parental support are also related to gender 
differences in OTL (r = .56, p ≤ .02). Table 3 shows that, as with gender differences in 
performance, differences in parental encouragement are not related to indicators of 
general female status, but are related to indicators of female participation in the labor 
force. Systems in which females have more access to the labor force are systems in 
which there is less of a male advantage in parental support, and girls may even be more 
encouraged to do well in academic mathematics. 

Differences between boys' and girls' agreement with the statement that "boys 
need mathematical training more than girls" are heavily in favor of males agreeing 
more than females, with a sample mean difference of 10.6 (SD 11.5). The gender 
differences on this attitude are not related, however, to either differences in test 
performance or OTL (r = -.19 and r = -.14, respectively). And generally these differences 
are not related to female status, except for three indicators. In systems in which higher 
proportions of females use contraceptives, the gender differences in this attitude are 
smaller, but in systems with more young women marrying the gender difference in 
favor of males agreeing is larger. Systems which yield higher levels of occupational sex 
segregation also yield larger differences in the way boys and girls view gender and 
mathematics training. 

8 The modest, although non-significant, correlations between OTL and fertility rate, 
contraceptive use and youth marriage are all in an unpredicted direction. This is largely due to the 
fact that The Netherlands, a country in which females have a more equal social status, has the 
largest male advantage in OTL and Nigeria, a country with considerably less parity between the 
sexes, has the largest female advantage in OTL. 

The above analysis is cross-sectional. A related, but longitudinal, test of a 
sociological perspective on gender would suggest that over time the world creates social 
systems less structured around traditional attributes. Schools, and performance by male 
and female students within them, should reflect this change, and thus gender differences 
in mathematics performance should decrease over time. To test this hypothesis we 
compared gender differences in 8th grade mathematics performance almost two 
decades apart in the nine countries that participated in both the First International 
Mathematics Study (FIMS) done in 1964 and the SIMS in 1981.9 

The data presented in Table 4 support the notion that there has been a decrease
in the size of male superiority in 8th grade mathematics over the two decades. The
sample mean drops from an almost 4% male advantage in 1964 to almost complete
parity between the sexes in 1981. The individual country means show how this has
happened. In 1964, all but one of the countries had a distinct male advantage in the mean
difference. By 1981 four of these countries had dropped substantially toward parity between
the sexes. This trend has been noted in other data from just the USA (Kolata, 1989).
Two countries (Belgium and Finland) actually replace a male advantage with a female
advantage, a trend that runs counter to a strict interpretation of the hypothesis. Lastly,
two countries have different patterns of means. Israel, the only country in 1964 with a
female advantage, has a male advantage by 1981. And France's modest male advantage
20 years ago has strengthened over time.¹⁰
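
Because the two core tests differ in length (70 items in FIMS, 40 in SIMS), the comparison rests on percentage differences rather than raw-score differences. The following is a minimal sketch of that normalization, assuming the difference is the male mean minus the female mean expressed as a percentage of the items on the test. The function name and the input scores are hypothetical and are not values from either study.

```python
def mean_percent_difference(male_mean_items, female_mean_items, n_items):
    """Male minus female mean score, as a percentage of the items on the test."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return 100.0 * (male_mean_items - female_mean_items) / n_items


# Hypothetical mean numbers of items known (illustration only).
fims_gap = mean_percent_difference(30.8, 28.0, n_items=70)  # 70-item core test
sims_gap = mean_percent_difference(20.4, 20.0, n_items=40)  # 40-item core test
print(round(fims_gap, 2), round(sims_gap, 2))
```

Dividing by test length puts both studies on the same 0 to 100 scale, so a positive value always denotes a male advantage and a negative value a female advantage.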

A further test of this notion of a decrease in gender differences in performance
over time is to see if this is related to a change in the relative status of adult males and
females over time. In other words, a general sociological perspective argues that as
females gain more status relative to men in a society, performance differences between
males and females decrease. We focus on changes in occupational status in all labor



⁹ The FIMS study was very similar to the SIMS in sampling, measurement and design. The
core test was 30 items longer in the FIMS so we calculated a mean percent difference for each
country.

¹⁰ In Israel this may be due to a sizable influx of Sephardic immigrants, since the early 1960s,
who are more traditional in their use of gender as a stratifying quality. The results for France may
be due to similar circumstances around increased immigration from Arabic societies.

sectors. Figure 1 plots the change in female participation in the workforce from 1960
to 1980 against the change in gender differences in 8th grade mathematics from 1964 to
1981 for the nine countries that participated in both international studies.

Most of the countries show the predicted relationship. Seven of the nine systems
are in the uppermost quadrant of the graph, with increases in female labor force
participation associated with decreases in superior male mathematics performance over
two decades. The relationship, however, appears not to be strongly linear. The small
sample of cases precludes standard tests of significance, but a non-parametric test of the
ranking of the two variables yields a statistically significant relationship between the two
variables. There are, however, several outliers in Figure 1 worth noting. First, Sweden
shows less of a decrease in performance differences than its relatively large increase in
female labor force participation would predict. In part this may be due to the fact that the gender
difference in 1964 was, like that of the U.S., already small. Secondly, both Israel and France go
against the general trend by yielding an increasing male advantage in performance from
1964 to 1981.
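
The text does not name the non-parametric test applied to the rankings. Spearman's rank correlation is one common choice for nine cases, and the sketch below assumes it purely for illustration. The two lists are hypothetical values for nine systems, not the data plotted in Figure 1.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for nine systems (illustration only).
labor_force_change = [2.1, 5.4, 3.3, 8.0, 1.2, 4.7, 6.5, 9.3, 7.1]
decline_in_male_advantage = [1.0, 4.2, -2.8, 9.2, -1.5, 2.2, 3.9, 2.6, 8.5]
print(round(spearman_rho(labor_force_change, decline_in_male_advantage), 3))
```

Working on ranks rather than raw values is what makes the measure insensitive to the non-linearity noted above: only the ordering of the systems on each variable matters.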

Discussion

At the core of biological theories of gender performance differences on cognitive
tasks are assumptions about universal and consistent gender differences in performance.
The basic approach that is often used while testing biological hypotheses relies heavily
on this assumption. Our results, however, provide little support for this central
premise. Gender differences in 8th grade mathematics are not universal, nor are they
uniform. There appears to be substantial variation by educational system as to the size
and direction of gender differences. Many countries have no discernible differences and
in countries with differences, males do not always have the superior performance. And,
although there is some evidence to suggest that males do better than females on the
hardest of mathematical problems, this tendency is not universal, as a sizable group of
educational systems show no clear male advantage on these items.



We used various other sector combinations, such as non-agricultural, and found a similar 
pattern of results as those reported here for the more general indicator of female participation in 
all sectors of the economy.

TABLE 4

Comparison of Gender Differences in 8th Grade Mathematics Between the First
International Mathematics Study (FIMS, 1964) and the Second International
Mathematics Study (SIMS, 1981).

                                 Mean % difference
                                  Males - Females

Country                           1964a        1981b

Belgium c                          6.43        -2.75
England/Wales                      5.36         1.15
Finland                            4.07        -4.10
France                             4.29         7.10
Israel                      [illegible]         2.63
Japan                       [illegible]         0.10
The Netherlands                    6.64         4.43
Sweden                             2.57        -0.01
United States                      1.0?        -0.35

Country Mean                       3.94         0.91
Country Standard Deviation         2.39         3.45

T                                  2.15
(df)                               8
One-tailed test p =                .032

a Adapted from Husén (1967, p. 240); percentage estimated number
known of 70-item Core Test for Population 1b.

b Percentage estimated number known of 40-item Core Test.

c Belgium sample in 1964 is from the entire country, and the 1981
sample used here is from the Belgium-French portion of the
country.

Our analysis does not include information about the main operative factors in
biological arguments on gender. We do not measure spatial ability, hormonal, genetic
or neurological effects and our results are not incompatible with a biological
interpretation that would suggest that underlying universal gender effects are masked or
enhanced in various situations. Our results do suggest, though, a fundamental task for
the biological approach. Namely, if there are biological effects, they must be measured
directly and their size relative to non-biological effects must be assessed. It is not enough to
assume that universal gender effects exist in mathematics performance at the 8th grade
on the basis of results from students in one educational system. And it is not enough to
merely search for biological factors that correlate with gender as an explanation for
assumed performance differences. Until all of these pieces are pulled together into a
unified approach we will know little about the existence of biological influences on the
creation of gender differences in mathematical performance.

This variation in the pattern of gender differences suggests that there are sizable
social influences in their creation. Our analysis has explored several explanations for
these phenomena.

We have shown that schools are generally egalitarian in terms of boys' and
girls' access to training in mathematics at the 8th grade level. Contrary to a central tenet
of a social psychological approach, which suggests that the sexes are treated differently in
school and that boys are often favored, boys do not receive more training in
mathematics. And in some systems girls actually receive more training on the average
than do boys. We do not have data on other central processes that make up the "hidden
curriculum" perspective. For instance, we do not know if, within classrooms, teachers
teach differently to female students than they do to male students. Nor do
we have measures on a host of other face-to-face processes which could be stratified by
gender, such as the effects of guidance counselors (Fox, et al., 1979; Pietrofesa
& Schlossberg, 1977; Shafer, 1976). But to the degree that our extensive measure of OTL
taps general access to mathematics, schools do not seem to favor boys by teaching them
more mathematics than they teach girls.

Lastly, our analysis has yielded some evidence for a sociological perspective on
gender differences, although the data suggest that the sociological process is not as
general as it is often assumed to be. While societal level indicators of gender parity in
the labor force are generally related to gender differences in 8th grade mathematics
performance, other indicators of gender status are not related to performance
differences. The effects of gender status on performance seem to be domain specific. If
a society incorporates more women into the formal labor market, its students exhibit
smaller gender differences in school mathematics performance. But if a society incorporates
women into other domains, this may or may not affect gender differences in
performance.

Gender status is not monolithic across all institutions within a society. Rules
about gender vary across institutions within the same society; and the degree to which
one institution is connected to another will shape how much or how little gender will
play a part in the roles under joint control of these institutions.

Schooling and the labor market are strongly connected in most societies. To the
degree that school is an institution of preparation for the work place, our findings verify
a sociological creation of gender performance differences. In systems in which girls have
more of an option to enter the labor force, their performance in mathematics is more
similar to boys'. The social process behind this phenomenon is hinted at through our
analysis of the students' perception of parental encouragement to study mathematics.
Systems with more gender parity in parental encouragement are those with more
females in the labor market. Also, gender parity in parental encouragement is related to
gender parity in performance. In societies with labor force opportunities for both men
and women, parents encourage both their sons and daughters to study mathematics and
both boys and girls do this. These results suggest that social opportunities (or barriers to
opportunity) resonate down to performances of actors within social systems.

Further, we found that, as a general modernity hypothesis would predict, gender
differences in mathematics performance have decreased over time in nine educational
systems scattered around the world. This parallels earlier evidence suggesting that
gender has become less of a barrier to access to mathematics and science instruction in
8th and 9th grade (Keeves, 1973).

The incorporation of women into a wider sphere of economic participation in
many societies has been the result of a number of processes, chief among these being the
expansion of schooling on a Western model and the state's breakdown of traditional
modes of social stratification. Besides the economic benefits of this process (Benavot,
1989), the belief that its full adult population is a nation's chief economic resource has
become a standard political notion. Witness the recent publicity about the "crisis" over
the decline in American students who will enter scientific and technical training and
the calls for the expansion of training in these areas to a full range of students (Johnston
& Packer, 1987; Walker, 1988).

These kinds of processes seem to filter down to the interests and actions of
individuals. The fact that most of the nine educational systems for which we had
longitudinal data exhibited dramatic declines in male advantages in mathematics in less
than 20 years indicates the potency of these social effects on individual behavior.

There are a number of reasons to be cautious about making too sweeping a
conclusion from our results as to their bearing upon central assumptions of theories of
gender effects on performance.

First, the SIMS data include only one subject. Similar analyses should be done
for other academic subjects, particularly those subjects, such as reading, for which girls
have been thought to have an inherent (or otherwise) advantage over boys (Maccoby &
Jacklin, 1974). This would broaden a comparative treatment of gender effects. For
example, a sociological argument would suggest that a decrease in gender stratification
would also reduce gender differences in reading. Also, analysis of subjects that depend
on mathematical ability, such as science, should be done to verify the findings we report
for mathematics.

Second, we have concentrated only on 8th grade performance. A comparative
analysis should be done on other school levels. This is particularly important for
secondary schooling, for which a number of hypotheses exist about gender effects on
curriculum tracking and choice of subjects that can influence performance factors.

Third, in examining the assumptions behind sociological accounts of gender we
have focused on economic and general social statuses of women and men. Other
institutions need to be considered. For example, within schooling itself, certain
institutional arrangements can foster status parity or differences between the sexes
which could be hypothesized to influence performance. These would include the
relative opportunities for technical training for males and females later in school and so
forth.

Fourth, although the SIMS sample of national educational systems is moderately
large (about 10% of all nations in the world), the sample was not as representative of less
developed countries and certain portions of the world (i.e., Latin and South America) as
one would like. Gender effects may be different in these systems, although we do not
know of any major arguments to suggest that these omissions would have greatly
changed the overall pattern of results. Also, in some of the analyses we were forced to
use a reduced sample and thus had to give more weight to any outlying cases.

Additionally, the SIMS data do not contain all of the variables one would like
to have to analyze most "hidden curriculum" arguments. We only had a classroom
level estimation of access to instruction. Although this is important, there is other
research to suggest that within classroom access can be stratified by gender (Hallinan &
Sorensen, 1987). Also, the data set is not as sensitive to a number of within country
variations from which one could pull corroborating evidence of the processes we have
looked at here (Schildkamp-Kündiger, 1982; Theisen, Achola & Boakari, 1983).

Aside from these caveats, these data and other IEA data sets are the best available
to assess academic performance comparatively. The careful standardization of test
items, the attempts to make each within country sample representative of schooling,
and the overall size of the number of students, teachers, classrooms and schools
involved lend credibility to any results derived from these data. Until there are better
data, these represent the best estimate that we have on the relative effects of gender on
similar mathematics tests around the world.

Conclusion 

How well do the central assumptions of the three general perspectives on gender
phenomena fare in light of our evidence on gender and mathematics performance
across 19 educational systems? We find mixed evidence for all three perspectives, with
some variation in the clarity of the evidence. Our findings are most damaging to the
naivest of biological arguments and are most supportive of sociological perspectives if
they are modified to consider specific institutional effects. Although there is clearly
some kind of social psychological process at work here, a central assumption of a
"hidden curriculum" perspective is not supported. We establish clear evidence of a
world trend in schools to give females access to mathematical training at the 8th grade.
Still, there are many unanswered questions within a general "hidden curriculum"
approach to gender stratification within schools.

Perhaps most importantly, our results have demonstrated the advantages of
considering gender effects comparatively. Without this kind of perspective it becomes
very difficult to consider a full range of hypotheses. As we have shown, the lack of
comparative data has led to the building of some theoretical perspectives on
unwarranted assumptions about how females and males perform cognitive tasks.

Most interesting is the evidence, particularly the longitudinal results, suggesting
that sociological processes may lessen gender effects on performance. Until now these
processes have generally been left untested within the area of mathematical
performance. Being in one society versus another has ramifications not only for the
level of mathematics students master, but also for the level of gender stratification of
that knowledge.

Male superiority in mathematics performance in schools has decreased over the
last two decades; this trend seems to be related to the greater incorporation of women
into the labor market. It may also be related to the even larger process of citizen
formation and the incorporation of a modern notion of the individual (Ramirez &
Boli-Bennett, 1988). This needs to be tested further.

This sociological evidence about the size and direction of gender differences in
educational systems merits further comparative consideration. This evidence, we think,
especially merits consideration by proponents of theories that assume that gender effects
on performance are only created by face-to-face or biological processes.

APPENDIX A

Sample Size for Students and Classrooms for the Nineteen SIMS Educational Systems

Country        Student Sample    Classroom      Male:Female

BFR            [illegible]       [illegible]    1.14
BRC            2567              [illegible]    [illegible]
ENG            2678              [illegible]    0.85
FNL            4484              [illegible]    1.10
FRA            8778              [illegible]    0.77
HKG            5548              [illegible]    1.03
HUN            1753              [illegible]    0.93
ISR            3819              [illegible]    1.04
JAP            7785              [illegible]    1.06
LUX            2106              [illegible]    0.97
NCR            1465              [illegible]    [illegible]
NTH            5500              [illegible]    [illegible]
NZL            5978              [illegible]    [illegible]
ONT            [illegible]       [illegible]    1.07
[illegible]    1356              354*           1.16
SWD            3585              186            1.10
SWZ            904               25             0.86
THA            4030              99             1.08
USA            6957              250            0.93

Total          [illegible]       2681

*Number of teachers, not actual classroom count.



APPENDIX B

Means and Standard Deviations for Female Status Variables

Variable                              Mean     Standard Deviation

Fertility Rate                        14        1.4
% Female Use Contraceptives           68.2     18.3
% Female 15-19 Married                 7.4      8.8
% Female in National Legislation      11.4      9.7
% Female Labor Force                  32.7      7.8
% Female Labor Force (1980)           36.6      6.4
% Female Industrial                   23.9      8.4
% Female Service                      45.7      7.2
% Female Agricultural                 31.1     10.8
Gender Occupation Segregation Index   40.1      7.0
Ratio Female to Male Earnings         73.8      7.0

INTRODUCTION



Although appropriate methods for analyzing hierarchically structured data
have been available since the early 1970's (Dempster, Laird & Rubin, 1977; Lindley &
Smith, 1972), application of these methods to educational policy decisions in
developing countries has been hampered by two important shortcomings: (a) the
absence of computationally efficient algorithms for multi-level analysis, and (b) the
lack of adequate data (sufficient cases at each organizational level). Recently, new
computational methods have been developed that address the first problem
(Goldstein, 1986; Longford, 1987; Raudenbush & Bryk, 1986), and data sets sufficient
for their application have been collected in a number of developing countries. This
paper applies one of the techniques to longitudinal data recently collected by the
International Association for the Evaluation of Educational Achievement (IEA) in
Thailand to answer three important questions for policy-makers: Which
characteristics of schools and teachers are associated with student learning over
time? To what extent? And, are the differences among schools uniform across
different types of students, or are some schools more effective with certain types of
students?

The comparative effectiveness of schools, particularly the relative efficiency
with which alternative inputs and management practices enhance student
achievement, has become the center of a lively debate in the literature (see, for
example, Goldstein, 1984; Heyneman, 1986; Reynolds, 1985; Rutter, 1983; Willms,
1987). These issues have important implications for how governments and
international development agencies should allocate their limited resources:
whether they should concentrate on certain types of inputs (capital investment,
lowering class size) or should finance others (instructional materials, teacher or
headmaster training, student testing). In the United States and United Kingdom, the
debate was sparked by studies that claimed to identify effective schools: those that
enhanced student achievement more than other schools working with similar
students and material inputs (see Raudenbush, 1987, for a recent review). In
developing countries, research on school effectiveness has been limited; studies that
have examined the effects of alternative inputs on student achievement have not
taken into account the explicitly hierarchical nature of the explanatory models and
data.

The "effective schools" issue has been fueled by controversy over
methodology, interpretation and data (for example, Sirotnik & Burstein, 1985). The
most important methodological issue is the use of inappropriate statistical models
for analyzing multi-level data. The argument concerns how behavior at one level
(e.g., classroom, school, district) influences behavior at a different level (e.g.,
students), and how to correctly estimate these multi-level effects.¹ Hierarchically
structured data are common in social research, because social institutions are
typically hierarchically organized, but commonly used statistical techniques for
dealing with related data may lead to biased estimates.² In particular, it has been
established that, when observations within clusters on any stratum are more
homogeneous than those between clusters, using ordinary least squares (OLS)
regressions with such data can lead to biased estimates of regression coefficients in
unbalanced designs, and to substantially biased standard errors for these estimates
even in balanced designs. Most policy research entails the use of unbalanced designs,
and so a serious problem may arise when ordinary least squares regression estimates
are used for quantifying the effects from alternative inputs.

Proper analysis of multi-level data entails two distinct changes in thinking
about data. First, the demands of inherently hierarchical data, such as much
education data, need to be confronted at the conceptualization stage, so that sufficient
numbers of units at each level are sampled (e.g., adequate samples of schools and
classrooms, in addition to sampling of students). Second, and more important,
hierarchical analysis requires a major shift in how problems of organizational effects
on individuals are viewed; instead of considering only effects on levels, effects on
relationships are also modelled. For example, in education, certain school or
classroom interventions may affect not only average student achievement, but also
lessen the hypothesized correspondence between family background and student
achievement. Here an organization-level force serves to mediate an individual-level
effect.

Until recently, most discussions of multi-level analysis have remained
theoretical, bounded by costs and computational requirements of existing analytic
tools. However, the debate has been energized by the recent development of new
analytic tools for analyzing multi-level data (Aitkin & Longford, 1986; Goldstein,
1986; Mason, Wong & Entwisle, 1984; and Raudenbush & Bryk, 1986). Although the
development of the general EM algorithm (Dempster, Laird & Rubin, 1977) provided
a theoretically satisfactory and computationally manageable approach to covariance
component estimation in hierarchical linear models, it has seen limited application
in education policy research due to three shortcomings: slow convergence of the
algorithm, lack of suitable generally available software, and lack of understanding of
these techniques in the education research community. The new tools, by
comparison, offer computational algorithms for variance component analysis of
hierarchically structured data that converge rapidly and require only a moderate
amount of computation in each iteration. The research described here utilizes the
software VARCL, which implements the Fisher scoring algorithm of Longford (1987),
to address important policy questions regarding effectiveness and efficiency of
education in developing countries.

To date, application of the new tools in education policy research has been
limited to relatively few studies of schools in developed countries; to the best of our
knowledge, this is the first such application to data from developing countries.
Other research on developing countries has demonstrated that school-level inputs
have significant effects on student achievement (for example, Fuller, 1987;
Heyneman & Loxley, 1983; Heyneman & Jamison, 1980; Lockheed & Hanushek, 1988;
Psacharopoulos & Loxley, 1986). However, previously employed analyses have not
addressed the problem of multi-level data and may have over- or underestimated
the importance of classroom, school and district-level effects, which are those that
governments and donors can best address.

DESIGN 



Analytical Framework 

This project makes an important methodological contribution by application
of multi-level models to estimation of school and classroom effects on student
achievement. The problems with ordinary least squares (OLS) estimates of school
and classroom effects have been discussed at length by Aitkin & Longford (1986) and
Dempster, Rubin & Tsutakawa (1981). In short, these problems arise from the nature
of typical data in educational surveys.

Educational surveys involve hierarchically structured data: pupils within
classrooms, within schools, within administrative units or regions. Every classroom
(school, region) has its own idiosyncratic features that result from a complex of
influences, including composition, teaching practices and management decisions.
As a consequence, observations on students (e.g., their outcomes) are not statistically
independent, not even after taking account of the available explanatory variables.
This presents a violation of the assumptions for ordinary regression (OLS). The
main problem is not so much with the estimates themselves as with their standard
errors, and adjustment techniques based on the "design effect" are not satisfactory for
complex regression models.
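
The point about standard errors can be illustrated with a small simulation. The sketch below is not part of the original analysis: it generates pupils within classrooms with a shared classroom effect, regresses the outcome on a classroom-level input whose true effect is zero, and compares the textbook OLS standard error with the actual spread of the slope over repeated samples. All names, sample sizes and variances are assumptions chosen for illustration.

```python
import math
import random
import statistics


def ols_slope_and_naive_se(z, y):
    """Slope of y on z and its textbook OLS standard error (assumes independent rows)."""
    n = len(y)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    slope = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / szz
    intercept = y_bar - slope * z_bar
    rss = sum((yi - intercept - slope * zi) ** 2 for zi, yi in zip(z, y))
    return slope, math.sqrt(rss / (n - 2) / szz)


def simulate(n_classes=40, class_size=30, class_sd=1.0, pupil_sd=2.0, reps=200, seed=1):
    """Return (actual spread of OLS slopes, average naive standard error)."""
    rng = random.Random(seed)
    class_z = [rng.gauss(0.0, 1.0) for _ in range(n_classes)]  # classroom-level input
    slopes, naive_ses = [], []
    for _rep in range(reps):
        z, y = [], []
        for zi in class_z:
            class_effect = rng.gauss(0.0, class_sd)  # shared by every pupil in the class
            for _pupil in range(class_size):
                z.append(zi)
                y.append(class_effect + rng.gauss(0.0, pupil_sd))  # true slope is zero
        slope, se = ols_slope_and_naive_se(z, y)
        slopes.append(slope)
        naive_ses.append(se)
    return statistics.stdev(slopes), statistics.mean(naive_ses)


actual_sd, naive_se = simulate()
print(f"spread of slopes: {actual_sd:.3f}  mean naive SE: {naive_se:.3f}")
```

In this setup the slope estimates are centered on the true value, yet they vary from sample to sample considerably more than the naive standard error suggests, so a significance test based on the OLS standard error would reject too often.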

Variance component models are an extension of ordinary regression models;
the extension refers to more flexible modelling of the variation. Pupils are
associated with (unexplained) variation, but this variation has a consistent within-
classroom component, which itself has a within-school component, etc. Schools
vary, classrooms within schools vary and pupils within classrooms vary.

Consider the regression model for data with two levels of hierarchy (pupils j
within classrooms i):

$$y_{ij} = a + b x_{ij} + c z_{ij} + e_{ij} \tag{1}$$

where a, b, c are (unknown) regression parameters, x and z are explanatory variables,
y is the outcome measure and the random term e is assumed to be a random sample
from $N(0, \sigma^2)$. Variation among the classrooms can be accommodated in the
"simple" variance component model

$$y_{ij} = a + b x_{ij} + c z_{ij} + \alpha_i + e_{ij} \tag{2}$$

where the $\alpha_i$ form a random sample (i.i.d.) from $N(0, \tau^2)$ and the $\alpha_i$ and the $e_{ij}$ are
mutually independent. The covariance of two pupils within a classroom is $\tau^2$
(correlation $\tau^2/(\tau^2 + \sigma^2)$). If we knew the $\alpha_i$ we could use them to rank the
classrooms. The model (2) has the form of analysis of variance (ANOVA), with
distributional assumptions imposed on the $\alpha_i$. The advantages of this assumption
are discussed by Dempster, Rubin and Tsutakawa (1981) and Aitkin and Longford
(1986). In the former reference the term "borrowing strength" in estimation of the
effects of small groups is used. In addition, some schools may be more "suitable" for
pupils with certain backgrounds than others. This corresponds to variation in the
within-school regressions of y on x and z, and this situation can be suitably modelled
as

$$y_{ij} = a + b x_{ij} + c z_{ij} + \alpha_i + \beta_i x_{ij} + \gamma_i z_{ij} + e_{ij},$$

or

$$y_{ij} = a + b x_{ij} + c z_{ij} + \alpha_i + \beta_i x_{ij} + e_{ij}. \tag{3}$$

The classroom-level random effects $(\alpha_i, \beta_i)$ are assumed to be a random sample from
$N_2(0, \Sigma_2)$; here $\Sigma_2$ involves only 3 parameters, the variances of $\alpha$ and $\beta$ and their
covariance. Extensions to larger numbers of explanatory variables and to more
complex hierarchies are described in the literature (e.g., Goldstein, 1987; Longford,
1987; Raudenbush & Bryk, 1986).
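
The within-classroom correlation implied by model (2) can be made concrete with simulated data. The sketch below is illustrative only: it sets the regression coefficients to zero, draws a classroom effect and pupil-level noise, and recovers the two variance components with a simple method-of-moments (one-way ANOVA) estimator for equal-sized classes. This estimator is a stand-in chosen for brevity; it is not the maximum likelihood procedure used in this paper, and all function names and parameter values are hypothetical.

```python
import random
import statistics


def intraclass_correlation(between_var, within_var):
    """Correlation between two pupils in the same classroom under model (2)."""
    return between_var / (between_var + within_var)


def anova_variance_components(classes):
    """Method-of-moments estimates (between_var, within_var) for equal-sized classes."""
    size = len(classes[0])
    if size < 2 or any(len(c) != size for c in classes):
        raise ValueError("this sketch needs equal-sized classes with at least 2 pupils")
    within_var = statistics.mean([statistics.variance(c) for c in classes])
    # The variance of the class means estimates between_var + within_var / size.
    var_of_means = statistics.variance([statistics.mean(c) for c in classes])
    return max(var_of_means - within_var / size, 0.0), within_var


def simulate_classes(n_classes, size, between_var, within_var, seed=0):
    """Draw outcomes as classroom effect plus pupil noise (no covariates)."""
    rng = random.Random(seed)
    classes = []
    for _class in range(n_classes):
        effect = rng.gauss(0.0, between_var ** 0.5)
        classes.append([effect + rng.gauss(0.0, within_var ** 0.5) for _pupil in range(size)])
    return classes


classes = simulate_classes(n_classes=200, size=20, between_var=1.0, within_var=4.0)
est_between, est_within = anova_variance_components(classes)
est_icc = intraclass_correlation(est_between, est_within)
print(f"estimated within-classroom correlation: {est_icc:.2f}")
```

With a between-classroom variance of 1 and a within-classroom variance of 4, the implied correlation is 1/(1 + 4) = 0.2, and the estimate from 200 simulated classrooms should fall close to that value.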

The maximum likelihood estimation procedures for such models used in
this paper are based on the computationally efficient Fisher scoring algorithm
(Longford, 1987) implemented in the software VARCL (Longford, 1985). It provides
estimates of regression parameters and (co-) variances, together with standard errors
for them, and the value of the log-likelihood, which permits formal likelihood ratio
hypothesis testing.
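
A likelihood ratio test compares the maximized log-likelihoods of two nested models. The sketch below assumes the usual chi-square reference distribution and uses closed forms for 1 and 2 degrees of freedom only; the log-likelihood values are hypothetical and the function names are not part of VARCL. When the parameter being tested is a variance constrained to be non-negative, the chi-square p-value is generally conservative, so the result should be read as approximate.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 or 2 only."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise NotImplementedError("this sketch covers df = 1 and df = 2 only")


def likelihood_ratio_test(loglik_restricted, loglik_full, df):
    """Return (test statistic, approximate p-value) for two nested models."""
    statistic = 2.0 * (loglik_full - loglik_restricted)
    if statistic < 0:
        raise ValueError("the full model cannot fit worse than the restricted model")
    return statistic, chi2_upper_tail(statistic, df)


# Hypothetical log-likelihoods: model (2) versus model (3), which adds a slope
# variance and a covariance (2 extra parameters).
statistic, p_value = likelihood_ratio_test(-5230.4, -5226.1, df=2)
print(f"LR statistic = {statistic:.2f}, approximate p = {p_value:.4f}")
```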

The Sample 

The IEA Second International Mathematics Study (SIMS) sample comprised
99 mathematics teachers and their 4030 eighth-grade students and was derived
from a two-stage, stratified random sample of classrooms. The thirteen primary
sampling units were the twelve national educational regions of Thailand plus the
capital, Bangkok. Within each region, a random sample of lower-secondary schools
was selected. At the second stage, a random sample of one class per school was
selected from a list of all eighth grade mathematics classes within the school. The
resulting sample represented a 1% sample of eighth grade mathematics classrooms
within each region. This design, of course, does not distinguish between the school
and classroom levels, and so only inference about the aggregate of these effects is
possible.
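
The two-stage selection described above can be sketched as follows. The sampling frame, the region and school labels, and the number of schools drawn per region are all hypothetical; the actual study drew enough schools to yield roughly a 1% sample of classrooms in each region.

```python
import random


def two_stage_sample(frame, schools_per_region, seed=None):
    """Draw schools at random within each region, then one class per drawn school.

    frame maps region -> school -> list of eighth-grade mathematics classes.
    Returns a list of (region, school, class) tuples.
    """
    rng = random.Random(seed)
    sample = []
    for region in sorted(frame):
        schools = sorted(frame[region])
        n_schools = min(schools_per_region, len(schools))
        for school in rng.sample(schools, n_schools):  # stage 1: schools
            chosen_class = rng.choice(frame[region][school])  # stage 2: one class
            sample.append((region, school, chosen_class))
    return sample


# Hypothetical sampling frame with two regions.
frame = {
    "Region A": {
        "School 1": ["8/1", "8/2"],
        "School 2": ["8/1"],
        "School 3": ["8/1", "8/2", "8/3"],
    },
    "Bangkok": {
        "School 4": ["8/1", "8/2"],
        "School 5": ["8/1", "8/2"],
    },
}
sample = two_stage_sample(frame, schools_per_region=2, seed=42)
for row in sample:
    print(row)
```

Because exactly one class is drawn per school, every sampled school contributes a single classroom, which is why school and classroom effects cannot be separated in the analysis.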

At both the beginning and end of the school year, students were administered
a mathematics test covering five curriculum content areas (arithmetic, algebra,
geometry, statistics and measurement). Teachers completed several instruments at
the posttest, including a background questionnaire and a general classroom process
questionnaire. Teachers provided information about teaching practices and
characteristics of their randomly selected "target" class. Data about the school were
provided by a school administrator. In the following sections, a description of each
of the variables analyzed in this paper is provided (see Lockheed, Vail and Fuller,
1987, for a more extended discussion); acronyms for the variables are given in
parentheses. For easier orientation, the acronyms for pupil-level variables are given
in capital letters and for group-level (region/school/classroom) variables in lower
case letters. This will be clear from Tables 1 and 2, which provide definitions and
summary statistics for all variables.



Measures 

Mathematics achievement. The IEA developed five mathematics tests for use
in SIMS. One of the tests was a forty-item instrument called the core test. The
remaining four tests were thirty-five item instruments called rotated forms and
designated A through D. The five test instruments contained roughly equal
proportions of items from each of the five curriculum content areas, except that the
core test contained no statistics items. For purposes of this analysis we regard the
instruments as parallel forms with respect to mathematics content.

The IEA longitudinal design called for students to be administered both the
core form and one rotated form chosen at random at both pretest and posttest. In
Thailand, students were pretested using the core test and one rotated form. At
posttest, students again took the core test and one rotated form, but were prevented
from repeating the rotated form taken at pretest. Approximately equal numbers of
students took each of the rotated forms in both administrations.

One goal of this analysis was to predict posttest achievement as a function of 
pretest performance and of other determinants. Since students took the core form 
twice, the core form posttest score reflects, to some degree, familiarity with the core 
test items. Instead of using the core test, therefore, we analyzed scores obtained from 
the rotated forms, after they were equated to adjust for differences in test length and 
difficulty. In this analysis, we used equated rotated form formula scores for both 
pretest (XROT) and posttest (YROT) measures of student mathematics achievement. 

Table 1: Variable Names, Descriptions and Means (Proportions) 
of Student-Level Variables for Three Data Sets 

                                                                       Means/Proportions
Variable                                                               Data     Data     Data
Name      Description                                                 Set 1    Set 2    Set 3

Sample
          Students                                                     2076     2804     3025
          Classrooms                                                     60       80       86

Student-Level Variables
XROT      Pretest mathematics achievement score                        9.15     8.83     8.83
XAGE      Age in months                                              170.94   171.05   171.09
XSEX      Student sex (0 = female; 1 = male)                            .53      .53      .53
YFOCC     Father's occupational status:
            Unskilled or semi-skilled worker                            .15      .15      .15
            Skilled worker                                              .44      .45      .46
            Clerical or sales worker                                    .26      .26      .25
            Professional or managerial worker                           .15      .15      .14
YMEDUC    Mother's educational attainment:
            Very little or no schooling                                 .26      .26      .26
            Primary school                                              .58      .58      .58
            Secondary school                                            .09      .0?      .09
            College, university or some form of tertiary                .07      .07      .06
HCALC     Calculator at home (0 = no; 1 = yes)                          .31
YHLANG    Use language of instruction at home (0 = no; 1 = yes)         .49
YMOREED   Educational expectation:
            Less than two years                                         .08      .08      .08
            Two to four years                                           .30      .31      .30
            Five to seven years                                         .41      .41      .41
            Eight or more years                                         .22      .20      .21
YPARENC   Parental encouragement (1 = high)                            2.12     2.10     2.09
YPERCEV   Perceived mathematics ability (1 = high)                     4.05     4.05     4.05
YFUTURE   Perceived future importance of mathematics (1 = low)         2.06     2.05     2.06
YDESIRE   Motivation to succeed in mathematics (1 = low)               5.47     5.47     5.47

Student background characteristics. Basic background information about each 
student included his or her sex (XSEX), age in months (XAGE), highest maternal 
education (YMEDUC), paternal occupational status (YFOCC), home language 
(YHLANG) and home use of a four-function calculator (YHCALC). Paternal 
occupation was classified into four categories: (a) unskilled or semi-skilled worker, 
(b) skilled worker, (c) clerical or sales worker, and (d) professional or managerial 
worker. Highest maternal education was also classified into four categories: (a) very 
little or no schooling, (b) primary school, (c) secondary school, and (d) college, 
university or some form of tertiary education. 

Table 2: Variable Names, Descriptions and Means (Proportions) 
of Group-Level Variables for Three Data Sets 

                                                                       Means/Proportions
Variable                                                               Data     Data     Data
Name      Description                                                 Set 1    Set 2    Set 3

Sample
          Students                                                     2076     2804     3025
          Classrooms                                                     60       80       86

Group-Level Variables
SPCI81    District per capita income (in 1000 bahts)                  12.94    12.97
SENROLT   Number of students in school (in 1000)                       1.27     1.44     1.41
SSTEAM    Ability grouping for instruction (0 = no; 1 = yes)            .46      .47
SDAYSYR   Days in school year                                        195.04
SPUTEAR   Pupil-teacher ratio in school                               14.86    15.81    15.93
SQUALMT   % of teachers in school qualified to teach math               .57      .62      .62
TECMATH   Semesters of post-secondary mathematics                      3.95
TSEX      Teacher sex (0 = female; 1 = male)                            .33      .37
TAGE      Teacher age in years                                        29.04
TEXPTCH   Years of teaching experience                                 7.25
TNSTUDS   Number of students in target class                          43.61    42.61
TMTHSUB   Math curriculum (0 = remedial or normal; 1 = enriched)        .22      .20      .18
TXTBK     Frequent use of textbook (0 = no; 1 = yes)                    .55      .56      .58
CEFEED    Frequent individual feedback                                 2.15
TWORKBK   Use of published workbooks (0 = no; 1 = yes)                  .85      .83      .81
TVISMAT   Use of commercial visual materials (0 = no; 1 = yes)          .34      .40
TADMIN1   Weekly minutes spent in routine administration              26.84
TORDER1   Weekly minutes spent in maintaining class order             19.40    20.27    20.33
TSEAT1    Weekly minutes students spent at seat or blackboard         53.76    54.57

Student attitudes and perceptions. Five indices of student attitudes and 
perceptions were also included. Student educational expectations (YMOREED) were 
measured by a single item that asked about the number of years of full-time 
education the student expected to complete after the current academic year. The 
following categories were defined: (a) less than two years, (b) two to four years, (c) 
five to seven years, and (d) eight or more years. Parental encouragement 
(YPARENC) was measured by a four-item index composed of responses on a Likert-type 
scale in which students described their parents' interest in, and encouragement 
for, mathematics achievement. For example, for the item "My parents encourage 
me to learn as much mathematics as possible," response alternatives ranged from 
"exactly like" the student's parents (= 1) to "not at all like" the student's parents (= 5). 
The four items comprised a single factor, with principal component factor loadings 
ranging from .72 to .83 and communality of 2.43. A low score represented greater 
parental support. Perceived mathematics ability (YPERCEV), perceived usefulness of 
mathematics (YFUTURE), and motivation toward mathematics achievement 
(YDESIRE) were all developed from a factor analysis of the student attitude survey, 
which contained Likert-type items having response alternatives ranging from 
"strongly disagree" (= 1) to "strongly agree" (= 5). Factors were initially identified 
through VARIMAX factor analyses, and then confirmed through principal 
component analyses, from which factor scores were constructed. For YPERCEV, a 
low value represents a positive attitude; for YFUTURE and YDESIRE a high value 
represents a positive attitude. 

School characteristics. Data on six school characteristics are analyzed in this 
paper: (a) school size, as indicated by the total number of students enrolled in the 
school (SENROLT), (b) presence of ability grouping (SSTREAM), (c) length of the 
school year in days (SDAYSYR), (d) student-teacher ratio in the school (SPUTEAR), 
(e) percentage of the teaching staff qualified to teach mathematics (SQUALMT), and 
(f) district-level per capita income in 1981 (SPCI81). 

Teacher characteristics. Four teacher characteristics are analyzed: (a) sex of 
the teacher (TSEX), (b) remedial or typical versus enriched mathematics subject 
matter (TMTHSUB), and (c) whether or not the teacher used textbooks frequently in 
the class (TXTBOOK). 

Teaching practices. Six variables referring to teaching practices are considered: 
(a) providing feedback to students (a composite index of five elements of teaching 
practice: commenting on student work, reviewing tests, correcting false statements, 
praising correct statements, and giving individual feedback) (CEFEED); number of 
minutes per week the teacher spent on (b) routine administration (TADMIN1), (c) 
maintaining class order (TORDER1), (d) monitoring assigned seatwork (TSEAT1); (e) 
using commercially produced visual materials (TVISMAT), and (f) using workbooks 
(TWORKBK). 

In summary, the data contain information on 32 variables about 4030 
pupils from 99 schools. Of the 32 variables, 13 are student characteristics, 5 variables 
refer to the school, 4 to the teacher, 9 variables are defined for the classroom, and one 
variable is a characteristic of the district (catchment area). The distinction between 
variables defined for pupils and for classrooms/teachers/schools (henceforth groups, 
since they are confounded in the design) is important because they play different 
roles in explanation of variation. Also, it should be noted that the complete data set 
consists of 13 × 4030 + 19 × 99 = 54,271 units of data, although conventionally it would 
be conceived, and stored on a computer, as a data set with 32 × 4030 = 128,960 units of 
data. The data contain relatively more information about the groups (19 variables 
for 99 units) than for the pupils (13 variables for 4030 units). Arguably, group-level 
variables are also more reliable, because they refer to school or teacher records, and 
are responses from adult professionals, whereas the responses of pupils are subject to 
test-performance variation, recall of family circumstances and arrangements, 
variable interpretation of the questionnaire items, and so on. Also, pupil-level 
variables, e.g., SES or XROT, have a large group-level component of variation; 
groups vary a great deal in their composition (means, standard deviations, etc.) on 
these variables. Hence, not only the 19 group-level variables, but also to some extent 
the 13 pupil-level variables potentially explain group-level variation among the 99 
groups, whereas only the 13 pupil-level variables can explain some of the pupil-level 
variation of the outcome scores of 4030 pupils. 
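
The counts of data units quoted above can be checked with a few lines of Python. This is only an illustrative sketch; the constant names are ours and are not identifiers from the SIMS files.

```python
# Counts quoted in the text (names are illustrative, not SIMS identifiers).
N_PUPILS = 4030
N_GROUPS = 99
N_PUPIL_VARS = 13
N_GROUP_VARS = 19

# Distinct units of data: pupil-level values are needed once per pupil,
# group-level values once per group.
distinct_units = N_PUPIL_VARS * N_PUPILS + N_GROUP_VARS * N_GROUPS

# Conventional flat file: every variable is repeated on every pupil record.
flat_units = (N_PUPIL_VARS + N_GROUP_VARS) * N_PUPILS

print(distinct_units)  # 54271
print(flat_units)      # 128960
```

The gap between the two counts is the redundancy introduced by copying each group-level value onto every pupil record in that group.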



RESULTS 



The response rate for the 13 pupil-level variables is between 93 and 100 percent. 
There is no obvious pattern of missingness among the pupils; complete pupil-level 
records are available for 3466 individuals (86%). The group-level data are available 
for between 78 and 99 schools, but only 60 schools have complete records, and within 
these schools only 2076 pupils also have complete pupil-level data (51.5%). 

Our intention is to carry out a multiple regression analysis of the data, and 
seek a linear prediction formula for the posttest scale score (YROT) in terms of the 
pretest scale score (XROT) and a suitable subset of the 30 other (explanatory) 
variables. For a model which involves a given set of variables we would use the 
data on all pupils and schools for whom all the responses on the variables in the set 
are available (listwise deletion). Thus for a smaller, more parsimonious, set of 
variables we have a larger sample of pupils and schools. 
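
A minimal sketch of listwise deletion may make the trade-off concrete. The records below are invented for illustration (only the variable names are borrowed from the paper), and the helper `listwise` is a hypothetical name, not part of any software used in the study.

```python
def listwise(records, variables):
    """Keep only the records with a non-missing value on every listed variable."""
    return [r for r in records if all(r.get(v) is not None for v in variables)]

# Toy records; None marks a missing response. Values are invented.
records = [
    {"YROT": 14.0, "XROT": 10.0, "XAGE": 171, "TAGE": 29},
    {"YROT": 9.0, "XROT": 7.0, "XAGE": 168, "TAGE": None},
    {"YROT": 20.0, "XROT": 15.0, "XAGE": None, "TAGE": 35},
    {"YROT": None, "XROT": 12.0, "XAGE": 172, "TAGE": 41},
]

full_set = ["YROT", "XROT", "XAGE", "TAGE"]
reduced_set = ["YROT", "XROT"]

n_full = len(listwise(records, full_set))
n_reduced = len(listwise(records, reduced_set))
print(n_full, n_reduced)  # 1 3
```

Dropping variables from the set can only keep or enlarge the usable sample, which is the property the modelling strategy relies on.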

Our general strategy in this modelling approach is as follows: we start with the 
data set obtained by listwise deletion with respect to all variables (2076 pupils in 60 
schools), fit regression models to this data set, apply a conservative criterion (to be 
specified below) to exclude variables from the obtained regression formula, thus 
constructing a restricted set of explanatory variables. For this restricted set of 
variables (including the outcome YROT) we apply listwise deletion, which leads to a 
larger sample of pupils and schools. For this new data set we again fit regression 
models, simplify the regression formula, if possible, and continue on until no 
further reduction of the set of variables, and extension of the data set obtained by 
listwise deletion, is possible. 

Usually it cannot be assumed that the unavailable data are missing at 
random, i.e., that the distribution of a variable among the pupils from whom we obtain 
valid responses is similar to the distribution among the pupils whose responses are 
not available (missing). In educational surveys, typically, higher ability pupils, those 
with higher social status, etc., tend to have higher response rates, implying bias in 
estimates of certain population means, as well as in regression coefficients obtained 
from simple regression. Missingness at random is an unnecessarily stringent 
criterion for ensuring that omission of the subjects with missing data has no effect 
on the results of a regression analysis. It is sufficient to have conditional 
randomness, given the explanatory variables. It means that for any combination of 
explanatory variables the distribution of the outcome among the pupils in the 
sample is identical to that among the pupils excluded from the sample by the listwise 
deletion procedure. Intuitively, such an assumption becomes less stringent the more 
explanatory (conditioning) variables are used. On the other hand, a larger set of 
explanatory variables implies a larger proportion of subjects whose data are not used 
in the analysis. 
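
In symbols, and using notation that is ours rather than the paper's, let $M = 1$ if a pupil is excluded by listwise deletion and $M = 0$ otherwise, let $y$ be the outcome and $\mathbf{x}$ the explanatory variables in the model. One way to write the two conditions described above is:

```latex
% Assumed notation: M is the exclusion indicator, y the outcome,
% x the vector of explanatory variables used in the model.

% Conditional randomness, given the explanatory variables:
p(y \mid \mathbf{x}, M = 0) = p(y \mid \mathbf{x}, M = 1)
\quad \text{for all } \mathbf{x}.

% The more stringent requirement described first (joint distributions agree):
p(y, \mathbf{x} \mid M = 0) = p(y, \mathbf{x} \mid M = 1).
```

The first condition allows the excluded pupils to differ in their explanatory variables, as long as their outcomes follow the same regression relationship.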

An indication of the extent to which the criterion of conditional randomness 
is relevant can be deduced from comparisons of model fits for two different samples: 
the maximal sample obtained by listwise deletion with respect to the set of 
explanatory variables used in the considered model, and the sample obtained by 
listwise deletion with respect to a more extensive, or complete, set of explanatory 
variables. In a few such comparisons, reported below, we found close agreement 
between the paired analyses. 

Variance Component Models 

The hierarchical structure of the data, with pupils nested within groups, 
requires a form of regression analysis which takes into account the two separate 
sources of variation. Separation of the variation due to pupils and due to 
schools/classrooms is also of substantive interest, because the latter is a measure of 
the size of unexplained differences among the schools/classrooms. 

Relevance of variance component methods for analysis of data with 
hierarchies has been established by Goldstein (1986), Raudenbush and Bryk (1986) and 
Aitkin and Longford (1986); they address the previously-mentioned problems with 
the use of the ordinary regression methods when the assumption of independence 
of the observations is not satisfied. 

Variance component models compared with OLS. Variance component 
methods involve the explicit modelling of the student and group variation, and 
afford flexibility of modelling of the group variation, which cannot be allowed for in 
ordinary regression. The specification of a variance component model is necessarily 
more complex than for the ordinary regression. In standard situations, first the list 
of the regression variables involved in explanation of the outcome for a typical 
(average) group has to be declared, and then a sublist of this list should be declared, 
which contains the variables for which the within-group relationships vary from 
group to group. The full list of variables, referred to as the FIXED PART, is 
analogous to the list of the explanatory variables in ordinary regression. The sublist 
(RANDOM PART) may contain only pupil-level variables, which are not constant 
within all the groups, because within-group regression coefficients on group-level 
variables cannot be identified. 
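
The two rules for the RANDOM PART can be sketched as a small check. The data structure and the function `check_specification` are hypothetical and do not reflect the input syntax of VARCL or any other program; only the variable names come from the paper.

```python
# Pupil-level variables (a subset, for illustration only).
PUPIL_LEVEL = {"XROT", "XAGE", "XSEX"}

def check_specification(fixed_part, random_part):
    """Return a list of problems with a (FIXED PART, RANDOM PART) specification."""
    problems = []
    for v in random_part:
        if v not in fixed_part:
            problems.append(f"{v}: must also appear in the fixed part")
        if v not in PUPIL_LEVEL:
            problems.append(f"{v}: only pupil-level variables may have varying slopes")
    return problems

fixed_part = ["XROT", "XAGE", "XSEX", "SPUTEAR", "TSEX"]
print(check_specification(fixed_part, ["XROT"]))     # []
print(check_specification(fixed_part, ["SPUTEAR"]))  # one problem: group-level variable
```

A group-level variable such as SPUTEAR is constant within each group, so there is no within-group slope on it to vary.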

Variance component models involve two kinds of parameters. The fixed 
effects parameters refer to the regression relationship for the average group. Their 
interpretation is analogous to the regression parameters in the ordinary regression. 
The random effects parameters are variances and covariances that describe the 
between-group variation in the regression relationship. Of prime interest are the 
sizes of the variances. Zero variance of a regression coefficient corresponds to a 
constant relationship across the groups. In order to obtain information about the 
variation we require, in general, a substantially larger number of pupils and groups 
than for the regression parameters. We can therefore expect to find a small random 
part, containing only a few variables, as a sufficient description of the variation, 
whereas the fixed part may contain most of the available explanatory variables. 

One important aspect of the separation of the two sources of variation is in 
distinguishing between pupil- and group-level variation. This comes out very clearly 
in the following examples: it turns out that we have abundant group-level 
information, i.e., a good description of the between-group variation, but a much 
larger proportion of the student-level variation remains unexplained. 

To fix ideas, we consider first a specific model 

$$y_{ij} = \sum_{k=1}^{K} x_{ij,k}\, b_k + d_j + e_{ij},$$

where the indices $i = 1, \ldots, n_j$, $j = 1, \ldots, N_2$, and $k = 1, \ldots, K$ represent the pupils, 
groups, and the variables, respectively. The $b$'s are the regression parameters, and the 
$d$'s and $e$'s are the group- and pupil-level random effects, and are assumed to be 
independent random samples from the normal distribution with zero means and 
variances $t^2$ (group level) and $s^2$ (pupil level). In analogy with the ordinary 
regression we can define $R^2$, the proportion of variation explained, as 

$$R^2 = 1 - \frac{s^2 + t^2}{s^2_{\mathrm{raw}} + t^2_{\mathrm{raw}}},$$

where the subscript "raw" refers to the variance estimates in the "empty" variance 
component model 

$$y_{ij} = m + d_j + e_{ij}.$$

It is advantageous, however, to define two separate $R^2$'s which refer to the 
two levels of the hierarchy: 

$$R_p^2 = 1 - \frac{s^2}{s^2_{\mathrm{raw}}}$$

$$R_g^2 = 1 - \frac{t^2}{t^2_{\mathrm{raw}}}$$

for pupils and groups, respectively. 

Example 1: Ordinary regression. In the present analysis, for a data set 
obtained by listwise deletion with respect to a set of variables considered below (3136 
pupils in 88 schools) we have for the simple regression of posttest (YROT) on pretest 
(XROT): 

$$E[\mathrm{YROT}] = 4.892 + \underset{(.015)}{.818}\,\mathrm{XROT}$$

and so $R^2 = 1 - s^2/s^2_{\mathrm{raw}} = .486$. 

The standard errors for the regression estimates will be given throughout the 
paper in parentheses in the line below the regression parameters. For example, .015 
above is the standard error for the regression coefficient on XROT, .818. The 
corresponding t-ratio is .818/.015 = 54.5. In this model, identification of pupils 
within schools is completely ignored, and the pupils are assumed to be a randomly 
drawn sample from the population of all pupils in a given grade in the country. A 
pupil with a given pretest score X is expected to score 4.892 + .818X on the posttest 
administration. This pupil would be, however, likely to have a score quite 
substantially different from this prediction, because the variance of all the pupils 
with a given pretest score is $\hat{s}^2 = 42.56$ (standard deviation $\sqrt{42.56} = 6.5$). The 
prediction is still a marked improvement over using only the overall mean of the 
YROT scores, 12.2, as a prediction for the pupil. Then the standard deviation would 
be 9.1 ($\sqrt{82.80}$). 
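
The arithmetic in Example 1 can be reproduced from the reported estimates. The sketch below uses only figures quoted in the text; the variable and function names are ours.

```python
import math

coef, se = 0.818, 0.015          # slope on XROT and its standard error
s2_resid, s2_raw = 42.56, 82.80  # residual and raw variances of YROT

t_ratio = coef / se
r_squared = 1 - s2_resid / s2_raw
sd_resid = math.sqrt(s2_resid)
sd_raw = math.sqrt(s2_raw)

def predict(x):
    """OLS prediction of the posttest score from a pretest score x."""
    return 4.892 + coef * x

print(round(t_ratio, 1), round(r_squared, 3), round(sd_resid, 1), round(sd_raw, 1))
# 54.5 0.486 6.5 9.1
```

Knowing the pretest score therefore narrows the typical prediction error from about 9.1 to about 6.5 score points.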

Since in future text it will be clear from the context whether the parameter or 
its estimate is meant, the hat (^) notation will be abandoned. 



Example 2: (Simple) variance component model. 

$$y_{ij} = m + d_j + e_{ij}$$

$$s^2_{\mathrm{raw}} = 55.56, \qquad t^2_{\mathrm{raw}} = 25.65$$

The variation of posttest scores has a substantial group-level component; the 
variance component ratio is $r = 25.65/81.21 = .316$. The variance component 
regression model is given as: 

$$E[\mathrm{YROT}] = 5.841 + \underset{(.018)}{.699}\,\mathrm{XROT}$$

$$s^2 = 38.55, \qquad t^2 = 4.78,$$

and so we have $R^2 = 1 - 43.33/81.21 = .466$, and 

$$R_p^2 = 1 - 38.55/55.56 = .306$$

$$R_g^2 = 1 - 4.78/25.65 = .814.$$
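
As a check on the figures in Example 2, the three proportions of variation explained can be recomputed from the variance estimates. This is a sketch with names of our own choosing; the inputs are the pupil-level variances (55.56 raw, 38.55 fitted) and group-level variances (25.65 raw, 4.78 fitted) reported for this example.

```python
def explained(raw, fitted):
    """Proportion of a variance (or of a sum of variances) explained by the model."""
    return 1 - fitted / raw

s2_raw, t2_raw = 55.56, 25.65   # empty model: pupil- and group-level variances
s2, t2 = 38.55, 4.78            # model with the pretest XROT

r2_total = explained(s2_raw + t2_raw, s2 + t2)
r2_pupil = explained(s2_raw, s2)
r2_group = explained(t2_raw, t2)

print(round(r2_total, 3), round(r2_pupil, 3), round(r2_group, 3))
# 0.466 0.306 0.814
```

The overall figure of .466 hides the contrast between the two levels: the pretest accounts for most of the group-level variance but less than a third of the pupil-level variance.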



Thus, if we make allowance for the within-school homogeneity of the posttest 
scores, we obtain a prediction formula for the posttest score (Y = 5.841 + .699X) that is 
substantially different from the OLS regression obtained in Example 1. Note also by 
how much the school-level variation has been reduced. Table 3 presents the 
comparison between the simple OLS and simple variance component models. 
Clearly the latter extension of the $R^2$ for variance components is more informative. 
The pretest score XROT is a powerful predictor of the posttest score YROT. But 
whereas it explains more than 80% of the variation among the groups, the 
proportion of the pupil-level variation explained is only 30%. The school-level 
variation in the outcome scores reflects the pretest score to a great extent. Some of 
the remaining within-group variation may be explained by the other explanatory 
variables, but they are not likely to have as dominant an effect as the pretest score. 
The variation associated with the testing and scoring procedure, which could be 
demonstrated in an experiment with repeated administration of the test, use of 
alternate forms, etc., will remain as a component of the pupil-level variation. Thus, 
whereas group-level variation can potentially be reduced to 0, pupil-level variation 
has a component that cannot be explained by any explanatory variables. In ideal 
circumstances (and in our case, almost) we can explain completely why/how schools 
vary; the variance of schools in the later models is very small. But pupil-level 
variation cannot be completely explained; there will always be the unexplained (and 
in our case unidentifiable) within-pupil variation. Since every pupil provides only 
one outcome score, the within-pupil and within-group variation cannot be 
separated. 

Table 3: Comparison of OLS and VCS Models 
of Grade 8 Mathematics Post-Test Predicted From Pretest, 
Thailand 1981-82 

Model                            OLS       VCS

Empty model
  $s^2$                        82.80     55.56
  $t^2$                                  25.65

Regression model
  Intercept                    4.892     5.841
  Coefficient                  0.818     0.699
  St. error coeff.             0.015     0.018
  $s^2$                        42.56     38.55
  $t^2$                                   4.78
  $R^2$                        0.486
  $R_p^2$                                0.306
  $R_g^2$                                0.814

The raw variance component ratio is .316, but for the model with the pretest 
score the ratio drops to .110. If pretest score is ignored, groups appear to have 
substantial differences. But schools appear to be much more similar (homogeneous) 
once we take account of the pretest scores, i.e., they are much more similar in the 
way they "convert" initial ability into outcome. 
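
Both ratios follow directly from the variance estimates of Example 2. A minimal sketch, with a function name of our own:

```python
def variance_component_ratio(t2, s2):
    """Share of the total variance that lies between groups."""
    return t2 / (t2 + s2)

raw_ratio = variance_component_ratio(25.65, 55.56)         # empty model
conditional_ratio = variance_component_ratio(4.78, 38.55)  # after adjusting for XROT

print(round(raw_ratio, 3), round(conditional_ratio, 3))
# 0.316 0.11
```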

If a group-level explanatory variable were added to the regression model, it 
would result in a reduction of only the group-level variance, which has already been 
substantially reduced. Therefore there is a limited scope for important group-level 
explanatory variables. By comparison, among the pupil-level variables there may be 
ones that explain a great deal of the remaining pupil-level variation. 

Inclusion of a pupil-level variable in the regression model will cause a 
reduction of both the pupil- and group-level variances. The relative sizes of the 
reductions of the two variances will depend on how the variation of the explanatory 
variable decomposes into between- and within-group variance. Hence the 
potentially most important pupil-level explanatory variables are those with little 
between-group variation. 
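
The decomposition referred to here can be sketched for a single pupil-level variable. The scores below are invented, and `decompose` is a hypothetical helper; it splits the total sum of squares into a between-group and a within-group part.

```python
def decompose(groups):
    """Split the total sum of squares of a variable into (between, within) parts."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return between, within

# Invented scores for two small groups of pupils.
mostly_between = [[1, 2, 3], [7, 8, 9]]   # groups differ, pupils within groups are alike
mostly_within = [[1, 5, 9], [2, 5, 8]]    # groups are alike, pupils within groups differ

print(decompose(mostly_between))  # (54.0, 4.0)
print(decompose(mostly_within))   # (0.0, 50.0)
```

A variable of the second kind cannot account for differences between groups, so any variance it explains must come from the pupil level.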

Example 3: Variable slopes model. The variance component model discussed 
above can be further generalized to the model which allows variable slopes on the 
pretest: 

$$y_{ij} = b_0 + b_1 x_{ij} + d_{0j} + d_{1j}(x_{ij} - \bar{x}) + e_{ij},$$

where $(d_{0j}, d_{1j})$ form a random sample from $N(0, \Sigma_d)$ and the $e$'s are i.i.d. 
$N(0, s^2)$. The maximum likelihood estimates for this model are: 

$$b_0 = 5.832, \qquad b_1 = \underset{(.019)}{.687}, \qquad s^2 = 38.367.$$

The software VARCL used for maximum likelihood estimation in variance 
component models estimates the square roots of the variances in $\Sigma_d$, and produces 
standard errors for these estimates: 

$$\sqrt{\Sigma_{d,11}} = \underset{(.202)}{2.224}, \qquad \sqrt{\Sigma_{d,22}} = \underset{(.0338)}{.0645}, \qquad \Sigma_{d,12} = \underset{(.0311)}{.0805}.$$

The value of the deviance (-2 log-likelihood) is 20496.3. Using the 
conventional t-ratio we conclude that the slope variance $\Sigma_{d,22}$ is not significantly 
different from 0, and so we can adopt the simple variance component model. More 
formally we can use the likelihood ratio test for comparison of the two variance 
component models. The deviance for the simple model is 20499.9, 3.6 higher than 
for the model with variable slope. The simpler model is obtained from the latter 
model by constraining to zero the slope variance $\Sigma_{d,22}$ and the slope-by-intercept 
covariance $\Sigma_{d,12}$. The fitted correlation of the slope and intercept is .56; the variance 
matrix $\Sigma_d$ is non-singular. Constraints on the two parameters (two degrees of freedom) 
have led to an increase of the deviance of only 3.6 (to be compared with the 
chi-square tables of critical values for 2 df), and hence we can declare that we have 
found insufficient evidence for variable slope of the posttest on pretest among the 
schools. The differences among the schools, described by the variance $t^2$ in the 
simple variance component model, are substantial and statistically significant; the 
formal likelihood ratio test for the hypothesis that $t^2 = 0$ is obtained by comparison of 
the deviance of the ordinary regression and the simple variance component models. 
The ordinary regression deviance (-2 log-likelihood, not the same as the residual 
sum of squares!) is equal to 20662.6, 162.6 higher than the deviance for the simple 
variance component model (chi-square with 1 degree of freedom). Also the t-ratio 
for $t^2$ is large. 
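
The two likelihood ratio comparisons can be sketched with the standard library alone, since the chi-square tail probabilities for 1 and 2 degrees of freedom have closed forms. The function names are ours. Differencing the reported deviances gives about 162.7 for the second comparison, which agrees with the 162.6 quoted above up to rounding. Because a variance cannot be negative, the chi-square reference distribution is only an approximation here, so the p-values are indicative.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variate with 2 degrees of freedom."""
    return math.exp(-x / 2)

dev_ols = 20662.6     # ordinary regression
dev_simple = 20499.9  # simple variance component model
dev_slopes = 20496.3  # variable slopes model

# Variable slopes versus simple variance component model (2 df).
lr_slopes = dev_simple - dev_slopes
p_slopes = chi2_sf_2df(lr_slopes)

# Simple variance component model versus ordinary regression (1 df).
lr_groups = dev_ols - dev_simple
p_groups = chi2_sf_1df(lr_groups)

print(round(lr_slopes, 1), round(p_slopes, 3))  # 3.6 0.165
print(round(lr_groups, 1), p_groups < 1e-30)    # 162.7 True
```

The first comparison falls well short of the 5 percent critical value of 5.99 for 2 df, while the second leaves no doubt about the presence of group-level variation.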

Making inference about variable relationships is of substantive importance in 
school effectiveness studies. Schools are expected to vary in their performance, after 
accounting for differences in the initial ability of the pupils, but other more complex 
patterns of between-school variation may arise: schools may be relatively more 
successful in teaching children with certain background characteristics, or they may 
either exaggerate or reduce differences among the pupils at enrollment. 

Variable relationships are intimately connected with variance heterogeneity. 
For illustration, we consider the variable slope model discussed above. The fitted 
variance of an observation is 

$$38.367 + 4.947 + 2\,(\mathrm{XROT} - 8.912) \times .08054 + (\mathrm{XROT} - 8.912)^2 \times .00416;$$

it is a quadratic function of the pretest. The minimal variance occurs for XROT* = 
8.912 - .08054/.00416 = -10.45, and is equal to 41.75. Only two pupils in the whole sample 
have scores lower than XROT*. Larger values of the explanatory variable XROT are 
associated with larger variance. For XROT = 9 (near the mean) the fitted variance is 
43.33, and for XROT = 30 (near the sample maximum) the fitted variance is 48.56. It 
would appear that for low-ability pupils the choice of the school they attend is 
slightly less important than for high-ability pupils. We have to bear in mind, 
though, that we are dealing with an observational study, not with an experiment, 
and in reality pupils, or their parents, do not exercise completely free choice over the 
school. Thus a causal statement or a prediction about a future manipulative 
procedure can be made only under the condition that all the other circumstances in 
the educational system remain intact. This is usually a very unrealistic assumption. 
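
The fitted variance function of the variable slopes model can be evaluated directly. The sketch below uses the estimates reported for that model (pupil-level variance 38.367, intercept variance 4.947, covariance .08054, slope variance .00416, centring value 8.912); the constant and function names are ours.

```python
S2 = 38.367          # pupil-level variance
VAR_INT = 4.947      # intercept variance (2.224 squared)
COV = 0.08054        # intercept-by-slope covariance
VAR_SLOPE = 0.00416  # slope variance (.0645 squared)
X_CENTRE = 8.912     # centring value of XROT used in the model

def fitted_variance(xrot):
    """Fitted variance of a posttest score for a pupil with pretest score xrot."""
    d = xrot - X_CENTRE
    return S2 + VAR_INT + 2 * d * COV + d * d * VAR_SLOPE

# The quadratic is minimised where its derivative in xrot is zero.
x_min = X_CENTRE - COV / VAR_SLOPE

print(round(x_min, 2), round(fitted_variance(x_min), 2))            # -10.45 41.75
print(round(fitted_variance(9), 2), round(fitted_variance(30), 2))  # 43.33 48.56
```

Because the covariance is positive, the minimum lies below the centring value, at a negative pretest score that almost no pupil attains, so over the observed range the fitted variance increases with XROT.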

Comparison of models. The comparison of the regression relationship (fixed 
effects) is instructive. We have 

1. Ordinary regression: 

$$E[\mathrm{YROT}] = 4.892 + \underset{(.015)}{.818}\,\mathrm{XROT}$$

2. Simple variance component model: 

$$E[\mathrm{YROT}] = 5.841 + \underset{(.018)}{.699}\,\mathrm{XROT}$$

3. Variable slopes: 

$$E[\mathrm{YROT}] = 5.832 + \underset{(.019)}{.687}\,\mathrm{XROT}$$

The estimate of the regression coefficient on XROT in ordinary regression is 
substantially different from the estimates in the two variance component models. 
Ignoring the hierarchical structure of the data would lead to different conclusions, 
say, for prediction of posttest (YROT) from pretest (XROT). In other words, whereas 
the OLS estimate could be interpreted to mean that each point on the pretest is worth 
.82 points on the posttest, the VCS estimate more accurately places this value at .69 
points. 
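
To see what the different slopes mean for prediction, the three fitted lines can be evaluated at a few pretest scores. The dictionary and function below are illustrative names of our own; the coefficients are those listed above.

```python
# (intercept, slope on XROT) for the three fitted models.
MODELS = {
    "OLS": (4.892, 0.818),
    "simple variance component": (5.841, 0.699),
    "variable slopes": (5.832, 0.687),
}

def predict(model, xrot):
    """Predicted posttest score for an average group under the named model."""
    intercept, slope = MODELS[model]
    return intercept + slope * xrot

for name in MODELS:
    print(name, [round(predict(name, x), 2) for x in (0, 9, 30)])
```

Near the mean pretest score the three predictions are close, but at a pretest score of 30 the OLS prediction exceeds the simple variance component prediction by about 2.6 points.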



Multiple Regression Models 

The purpose of this section is to obtain the most parsimonious simple 
variance component model of grade 8 mathematics learning in Thailand, given the 
available data. 

We proceed as follows. First we fit the simple variance component model 
using the largest data set obtainable by listwise deletion with respect to a given set of 
variables. Second, we apply an exclusion criterion, defined below, to eliminate 
variables from the model, creating a new model, and then we fit this new model on 
the same data. These three steps are repeated, with listwise deletion with respect to 
the restricted set of variables, until no more variables can be eliminated. 

Regression with all the variables. We begin with fitting simple variance 
component models (VCS), i.e., models involving no variable slopes, to the data set 
obtained by listwise deletion with respect to all the available variables. This data set 
contains 2076 pupils in 60 schools. 

The ordinary regression fit (OLS) of the posttest on pretest is 

$$E[\mathrm{YROT}] = 4.882 + \underset{(.017)}{.817}\,\mathrm{XROT}, \qquad s^2 = 42.20,$$

which is in close agreement with the OLS fit reported above for a larger data set (3136 
pupils in 88 schools). The corresponding simple variance component model fit is: 

$$E[\mathrm{YROT}] = 5.670 + \underset{(.020)}{.720}\,\mathrm{XROT}$$

$$s^2 = 38.79, \qquad t^2 = 4.02.$$


Compared to the larger data set, we find some discrepancies: the fitted regression 
slope for the smaller data set is higher (.720 vs. .699), and the group-level variance is 
smaller (4.02 vs. 4.78). Variation of the slope on XROT is not significant in either 
sample, but it is two-and-a-half times as great in the larger data set (.00416) as in 
the smaller one (.00166). It appears that the 28 schools added to the data are more 
likely to have lower regression slopes, and contain proportionately more extreme 
schools (very "good" or very "bad"), because the larger sample has larger group-level 
variance $t^2$. We emphasize that all these differences may arise purely by chance, 
rather than as a result of non-random missingness of data, but they can have a 
substantial effect on the inferences drawn. 

The OLS and VCS model estimates for the 2076/60 data using all the 
explanatory variables are given in Display 1. The dominant explanatory power of 
the pretest score XROT is obvious, judging not only by the t-ratio for its regression 
coefficient (31.38 for OLS and 30.80 for VCS), but also by the comparison of the 
variance component estimates across models. The raw variance component 
estimates are: 

$s^2_{\mathrm{raw}}$ = 57.30 
$t^2_{\mathrm{raw}}$ = 2[illegible] 

The pretest score XROT on its own leads to reduction of these variances to 38.79 ($R_p^2$ 
= 32%) and 4.02 ($R_g^2$ = 86%), but the other 30 variables reduce the pupil-level 
variance only marginally (to 36.8, $R_p^2$ = 36%). The group-level variance is almost 
saturated (1.32, $R_g^2$ = 95.5%). It appears that we have abundant information about 
the groups, but we are less successful in explanation, or suitable description, of 
pupil-level variation. 

The relatively large number of group-level variables raises the concern about 
multicollinearity, i.e., competing alternative descriptions of the data. To deal with 
this problem we apply a conservative criterion for exclusion of explanatory variables 
from our models. We regard a variable as not "important" for the fixed part of the 
VCS model if the t-ratio of its regression coefficient is smaller (in absolute value) 
than 0.9 at the first stage of model reduction and 1.0 thereafter. In the first round of 
simplifying the model we use the 0.9 criterion to exclude two pupil-level variables 
(HCALC and YHLANG) and six group-level variables (SDAYSYR, TECMATH, TAGE, 
TEXPTCH, CEFEED, and TADMIN1) from the full list of 31 variables. 
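
The exclusion rule can be sketched as a simple filter on t-ratios. The estimates and standard errors below are the VCS columns of Display 1 for a handful of single-coefficient variables, as far as they can be read there; the dictionary and the function `split_by_t_ratio` are illustrative names, and variables represented by several dummy coefficients are left out of this sketch.

```python
# (estimate, standard error) from the VCS columns of Display 1.
VCS_ESTIMATES = {
    "XROT": (0.647, 0.021),
    "HCALC": (-0.217, 0.309),
    "YHLANG": (0.012, 0.341),
    "YDESIRE": (0.228, 0.233),
    "SSTEAM": (-0.500, 0.512),
    "SDAYSYR": (-0.010, 0.029),
    "TECMATH": (-0.044, 0.053),
    "TAGE": (-0.001, 0.046),
    "TEXPTCH": (0.038, 0.064),
    "CEFEED": (0.209, 0.290),
}

def split_by_t_ratio(estimates, threshold):
    """Split variables into those kept and those excluded by the |t| criterion."""
    kept, excluded = [], []
    for name, (est, se) in estimates.items():
        (kept if abs(est / se) >= threshold else excluded).append(name)
    return kept, excluded

kept, excluded = split_by_t_ratio(VCS_ESTIMATES, threshold=0.9)
print(kept)      # ['XROT', 'YDESIRE', 'SSTEAM']
print(excluded)  # ['HCALC', 'YHLANG', 'SDAYSYR', 'TECMATH', 'TAGE', 'TEXPTCH', 'CEFEED']
```

YDESIRE and SSTEAM, with absolute t-ratios just under 1.0, survive the first round only because the threshold there is 0.9.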

Second model. Next we estimate a VCS model fit with this shorter list of 23 
variables. The results are shown in Display 2. Exclusion of these variables (8 degrees 
of freedom) has virtually no effect on the retained regression parameters and their 
standard errors (compare Displays 1 and 2; the exception is TVISMAT, which now 
fails to meet the inclusion criterion), and the increase in the variance components is 
only marginal in particular for the group-level variance. The r'lfference in 
deviances is 33 (cg^X 
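
The comparison of deviances is a likelihood ratio test. The sketch below, in Python with the standard library only, takes the deviances from Displays 1 and 2 and refers their difference to a chi-square distribution with 8 degrees of freedom. The helper uses a closed form for the chi-square tail that holds only for even degrees of freedom; the helper name is illustrative and not part of the original analysis.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # term is now half**i / i!
        total += term
    return math.exp(-half) * total


deviance_full = 13424.947      # Display 1, 31 variables
deviance_reduced = 13428.295   # Display 2, 23 variables
lr_statistic = deviance_reduced - deviance_full
p_value = chi2_sf_even_df(lr_statistic, df=8)
print(f"LR = {lr_statistic:.2f}, p = {p_value:.2f}")
```

A large tail probability indicates that the eight excluded variables contribute little to the fit.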

Then we obtain the largest data set obtainable by listwise deletion with respect
to the retained variables; this yields data for 2804 pupils in 80 schools. We then
compute the variance component analysis for this data set; results are given in

Display 1: OLS and VCS Model Estimates for 2076 Students and
60 Classrooms/Schools Using All 31 Explanatory Variables,
Thailand 1981-82

                        OLS                 VCS
Variable          Estimate St. Error  Estimate St. Error

GRAND MEAN          18.603         ?    19.717         -
XROT                  .680      .021      .647      .021
XAGE                 -.080      .016     -.077      .016
XSEX                  .732      .301      .969      .319
YFOCCI                .174      .431      .033      .434
                     -.631      .462     -.646      .460
                     -.178         ?     -.239      .542
YMEDUC                .021         ?     -.039      .325
                     -.129         ?         ?      .556
                     -.686      .661     -.899      .663
HCALC                -.120         ?     -.217      .309
YHLANG                .203         ?      .012      .341
YMOREED              1.087         ?     1.074      .541
                     1.570         ?     1.537      .541
                     1.638      .593     1.610      .589
YPARENC               .225      .137      .249      .136
YPERCEV              -.980      .160    -1.020      .161
YFUTURE               .574      .168      .526      .167
YDESIRE               .277      .236      .228      .233
SPCI81                .061      .042      .073      .060
SENROLT               .422      .263      .417      .386
SSTEAM               -.426      .358     -.500      .512
SDAYSYR              -.006      .020     -.010      .029
SPUTEAR              -.152      .051     -.170      .075
SQUALMT              1.023      .342     1.029      .494
TECMATH              -.035      .037     -.044      .053
TSEX                 -.580      .336     -.619      .481
TAGE                  .009      .032     -.001      .046
TEXPTCH               .014      .043      .038      .064
TNSTUDS               .035      .018         ?      .025
TMTHSUB              1.725      .432     1.941      .628
TXTBOOK              1.602      .338     1.650      .490
CEFEED                .148      .203      .209      .290
TWORKBK             -1.104      .218    -1.124      .314
TVISMAT               .380      .331      .461      .480
TADMIN1              -.003      .004     -.003      .006
TORDER1              -.037      .012     -.039      .016
TSEAT1                .011      .005      .011      .007

Variance            38.031         ?
Pupil-level Variance                    36.809
Pupil-level Sigma                        6.067
Group-level Variance                     1.317
Group-level Sigma                        1.148      .192
Deviance                             13424.947

? = entry illegible in the source.

Display 3. We see that the regression coefficients for the pupil-level variables are
stable across the data sets (compare with Displays 1 and 2), but for the group-level
variables there are substantial discrepancies. There are two separate, but possibly
complementary, explanations for these discrepancies: multicollinearity and non-
random missingness of data. Multicollinearity would cause the regression estimates
to be sensitive to changes in the data, in our case to inclusion of over 700 new
observations. As an alternative, the discrepancies could arise as a result of the non-
random missingness in our data, i.e., if the two data sets have genuinely different
regression characteristics. A suitable indication, though not a fool-proof check, for
the latter possibility is obtained by fitting models with identical specifications to
the different "working" data sets. We have fitted the reduced second model (Display
2) to the larger data set (Display 3), and although different values of the group-level
regression coefficients were obtained, it turns out that the reduced list of variables
also provides an adequate description for the data (as judged by the likelihood ratio
criterion). The pupil-level regression coefficients differ only marginally.

We conclude, therefore, that multicollinearity is the more likely cause of the 
discrepancies in the estimates; we have too many group-level variables, and so the 
parameter estimates are subject to large fluctuation with small changes in the data. 
The explanatory variables provide sufficient conditioning for the outcome data to be 
missing at random given the available explanatory variables. 
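
The sensitivity attributed to multicollinearity can be illustrated with the standard variance inflation argument: when a predictor has correlation r with one other predictor, the standard error of its coefficient grows by the factor sqrt(1 / (1 - r^2)) relative to the uncorrelated case. The Python sketch below is not part of the original analysis, and the correlations in it are hypothetical values chosen for illustration.

```python
import math


def se_inflation(r):
    """Inflation of a coefficient's standard error when its predictor has
    correlation r with one other predictor: sqrt(1 / (1 - r**2))."""
    if not -1.0 < r < 1.0:
        raise ValueError("correlation must lie strictly between -1 and 1")
    return math.sqrt(1.0 / (1.0 - r * r))


for r in (0.0, 0.5, 0.9, 0.99):   # hypothetical correlations, illustration only
    print(f"r = {r:4.2f}  standard error inflated by a factor of {se_inflation(r):.2f}")
```

With only 60 to 86 schools, even moderately correlated group-level variables can therefore produce estimates that shift noticeably when the data set changes.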

Display 2: OLS and VCS Model Estimates for 2076 Students and
60 Classrooms/Schools Using 23 Explanatory Variables,
Thailand 1981-82

                        OLS                 VCS
Variable          Estimate St. Error  Estimate St. Error

GRAND MEAN          18.118         -         ?         ?
XROT                  .685      .020      .650      .021
XAGE                 -.080      .016     -.076      .016
XSEX                  .723      .299         ?         ?
YFOCCI                .118      .426         ?         ?
                     -.621      .457     -.651      .457
                     -.139      .538     -.212      .541
YMEDUC                .037      .326         ?      .325
                     -.068      .559     -.115         ?
                     -.604      .656     -.855      .660
YMOREED              1.115      .545         ?      .540
                     1.568      .543     1.521      .540
                     1.666      .591         ?      .589
YPARENC               .238      .137         ?      .135
YPERCEV              -.970      .160    -1.010      .161
YFUTURE               .570      .168      .526      .167
YDESIRE               .287      .235      .234         ?
SPCI81                .050      .038         ?      .056
SENROLT                  ?      .251      .540      .373
SSTEAM                   ?      .324         ?      .472
SPUTEAR              -.178      .046     -.198      .068
SQUALMT              1.062      .327     1.090      .430
TSEX                 -.518      .314     -.536      .460
TNSTUDS               .036      .017         ?         ?
TMTHSUB              1.802      .409         ?         ?
TXTBOOK                  ?         ?         ?         ?
TWORKBK             -1.028      .204         ?      .300
TVISMAT               .368      .322      .393      .473
TORDER1              -.040      .010     -.043      .014
TSEAT1                .010      .005      .011      .007

Variance            38.108     6.173
Pupil-level Variance                    36.855
Pupil-level Sigma                        6.071
Group-level Variance                     1.351
Group-level Sigma                        1.162      .191
Deviance                             13428.295

? = entry illegible in the source.


Display 3: OLS and VCS Model Estimates for 2804 Students and
80 Classrooms/Schools Using 23 Explanatory Variables,
Thailand 1981-82

                        OLS                 VCS
Variable          Estimate St. Error  Estimate St. Error

GRAND MEAN          17.659         ?    17.314         ?
XROT                  .699      .017      .634      .019
XAGE                 -.079      .014     -.073      .014
XSEX                  .746         ?      .971         ?
YFOCCI                .197         ?      .101         ?
                     -.403         ?         ?         ?
                      .089      .458         ?         ?
YMEDUC                .306         ?         ?         ?
                      .088         ?      .149         ?
                     -.018         ?         ?         ?
YMOREED               .861      .476         ?         ?
                     1.086      .475     1.015         ?
                     1.617      .519         ?         ?
YPARENC               .388      .118         ?      .116
YPERCEV             -1.083      .137         ?         ?
YFUTURE               .576      .142         ?      .141
YDESIRE               .493      .201      .439      .198
SPCI81               -.029      .033     0.035      .057
SENROLT               .437      .187         ?         ?
SSTEAM               -.417      .275         ?         ?
SPUTEAR              -.095      .032     -.110      .058
SQUALMT               .698      .246      .784         ?
TSEX                 -.038      .266         ?      .463
TNSTUDS               .012      .014      .020      .023
TMTHSUB              1.836      .344     2.398      .593
TXTBOOK               .948      .266      .978      .161
TWORKBK                  ?      .167     -.499      .291
TVISMAT               .353      .269      .363      .468
TORDER1              -.024      .008     -.027      .013
TSEAT1                .005      .004      .006      .006

Variance            37.949     6.160
Pupil-level Variance                    35.868
Pupil-level Sigma                        5.989
Group-level Variance                     2.285
Group-level Sigma                        1.512      .174
Deviance                             18088.395

? = entry illegible in the source.

According to our exclusion criterion (t-ratio < 1) we now delete from the
fixed part of the model the following six group-level variables: SPCI81, SSTREAM,
TSEX, TNSTUDS, TVISMAT, and TSEAT1.

Third model. As before, we estimate this model with both smaller and larger
data sets. For the former, OLS and VCS model estimates for this reduced list of
variables are given in Display 4; the same schools and pupils are involved as for
Display 3. For the latter, 3025 students in 86 schools, we fit the reduced model (17
variables). The results are given in Display 5. Again, the difference in deviances (3.5,
$\chi^2_6$) is small. The effects of non-random missingness can be checked by comparison
of the estimates in Displays 4 and 5. Applying our exclusion criterion to the
variables in this model we find that no further reduction of the list of explanatory
variables is now possible.

We note that, owing to the relatively small number of schools, the
appropriate conclusion about the 14 group-level variables is that we "have found
insufficient evidence" of a systematic effect of these variables, rather than "our
analysis disproves their effects". Also, a different modelling scheme could lead to a
different "minimal" set of important explanatory variables. Because of collinearity,
there may be a set of alternative regression formulas that give a model fit which is
not substantially inferior to the one given in Display 5, in terms of the deviances. A
summary of the results of these analyses is provided in Table 4.

Display 4: OLS and VCS Model Estimates for 2804 Students and
80 Classrooms/Schools Using 17 Explanatory Variables,
Thailand 1981-82

                        OLS                 VCS
Variable          Estimate St. Error  Estimate St. Error

GRAND MEAN          17.321         ?    17.694         ?
XROT                  .701      .017         ?         ?
XAGE                 -.077      .014     -.073      .014
XSEX                  .676      .247     1.066      .270
YFOCCI                .181      .357      .085      .365
                     -.419      .387     -.465      .385
                      .105      .455      .062      .457
YMEDUC                .293      .280      .288      .276
                      .112      .465      .154      .458
                      .014      .563     0.297      .564
YMOREED               .869      .476      .786      .467
                         ?         ?         ?         ?
                     1.666      .520     1.560      .512
YPARENC               .393      .117      .377      .116
YPERCEV             -1.076      .137    -1.130      .136
YFUTURE               .592      .142      .537      .141
YDESIRE               .477      .201      .431      .197
SENROLT               .285      .164      .367      .289
SPUTEAR              -.074      .030     -.094      .054
SQUALMT               .808      .239      .880      .427
TMTHSUB              1.950      .329     2.562      .576
TXTBOOK               .948      .259      .946      .458
TWORKBK              -.433      .160         ?      .284
TORDER1              -.022      .006     -.024      .010

Variance            38.065     6.170
Pupil-level Variance                    35.871
Pupil-level Sigma                        5.989
Group-level Variance                     2.429
Group-level Sigma                        1.558      .176
Deviance                             18091.983

? = entry illegible in the source.


Display 5: OLS and VCS Model Estimates for 3025 Students and
86 Classrooms/Schools Using 17 Explanatory Variables,
Thailand 1981-82

                        OLS                 VCS
Variable          Estimate St. Error  Estimate St. Error

GRAND MEAN          17.536         -         ?         ?
XROT                     ?         ?      .629      .018
XAGE                 -.075      .014     -.071      .014
XSEX                  .658      .238     1.053      .260
YFOCCI                   ?         ?         ?         ?
                         ?         ?     -.435      .373
                         ?         ?      .123      .446
YMEDUC                   ?         ?      .343      .265
                         ?         ?      .073      .442
                         ?         ?     -.259      .555
YMOREED                  ?         ?      .755      .453
                     1.195         ?         ?      .452
                     1.703      .500     1.532      .494
YPARENC                  ?         ?         ?      .112
YPERCEV             -1.140         ?         ?      .132
YFUTURE               .614      .137         ?      .136
YDESIRE               .484      .194         ?         ?
SENROLT               .271      .160         ?         ?
SPUTEAR              -.076      .029     -.094      .052
SQUALMT               .847      .232      .903      .410
TMTHSUB              1.968      .327     2.546      .566
TXTBOOK              1.047      .250     1.071      .437
TWORKBK              -.434      .157     -.417      .275
TORDER1              -.023      .006     -.025      .010

Variance            38.271     6.186
Pupil-level Variance                    36.138
Pupil-level Sigma                        6.012
Group-level Variance                     2.353
Group-level Sigma                        1.534      .169
Deviance                             19537.962

? = entry illegible in the source.


Table 4: Summary of Displays 1-5

                                        Display
                                1        2        3        4        5

OLS
  Variance                  38.03    38.11    37.95    38.07    38.27
  St. error                  6.17     6.17     6.16     6.17     6.19
VCS Pupil-level
  Variance                  36.81    36.96    35.87    35.87    36.14
  Sigma                      6.07     6.08     5.99     5.99     6.01
VCS Group-level
  Variance (for GMEAN)       1.32     1.35     2.29     2.43     2.35
  Sigma                      1.15     1.16     1.51     1.56     1.53
  St. error for Sigma        0.19     0.19     0.17     0.17     0.17
Sample size
  Pupils                     2076     2076     2804     2804     3025
  Groups                       60       60       80       80       86



Modelling of group-level variation (random slopes and random differences)

Simultaneously with reducing the fixed (regression) part of the variance
component model for our data, we also need to explore extensions of the random
part in order to obtain a better description of the group-level variation than the one
offered by the group-level variance. We have concentrated first on reduction of the
fixed part to a shorter list of explanatory variables because: (a) the school-level
variation is rather small, and (b) in the models with complex description of
variation, the fixed effect estimates and their standard errors differ very little from
those obtained so far (Display 5).

In the variance component models fitted so far (Displays 1-5) the within-
group regressions are assumed to be constant across groups, with the exception of the
intercept (position), which has a fitted variance of 2.35. More generally, the
regression coefficients with respect to any of the pupil-level variables may be allowed
to vary across the groups. These variables, selected from the variables included in
the fixed part, form the random part of the model. The group-level variables are not
considered for the random part, because within-group regressions with respect to
such variables cannot be identified.

Variance component models closely resemble the models for analysis of
covariance. The simple variance component models correspond to ANCOVA
models with no interactions of covariates with the grouping factor. The (complex)
variance component models with variable within-group regressions (slopes and/or
differences) correspond to ANCOVA models with group x covariate interactions.
The difference between the variance component and ANCOVA models is in the
emphasis on description of variation as opposed to differences among the groups,
and in the assumptions of normality of the group effects in the former. The model
specification in both models is analogous:

a. list of covariates (fixed part),

b. sublist of covariates which have interactions with the grouping factor (random
part).

We now turn to modelling of the random part. For a continuous variable
included in the random part the within-group regression slopes with respect to this
variable are assumed to be randomly varying (and normally distributed) with an
unknown variance. For a categorical variable included in the random part the
within-group (adjusted) differences among the categories are normally distributed.
We can consider the 'stereotype' group, for which the regression is given by the fixed
part model (the average regression), and the regressions for the groups vary around
this average regression. The deviations of the regression coefficients form a random
sample (i.i.d.) from a multivariate normal distribution. The components of the
vector of deviations (for a group) cannot be assumed to be independent, and so their
covariance structure has to be considered, but the variances of these deviations (or
random effects) are of main interest.
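
The model described in words above can be written compactly as follows. The notation is an assumption introduced here for illustration and does not appear in the source: $y_{ij}$ is the outcome of pupil $i$ in group $j$, $\mathbf{x}_{ij}$ collects the variables in the fixed part, $\mathbf{z}_{ij}$ the sublist (including the constant) in the random part, and $\mathbf{u}_j$ the deviations of group $j$ from the average regression.

```latex
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}
       + \mathbf{z}_{ij}^{\top}\mathbf{u}_{j} + e_{ij},
\qquad
\mathbf{u}_{j} \sim N(\mathbf{0}, \boldsymbol{\Sigma}) \ \text{i.i.d.},
\qquad
e_{ij} \sim N(0, \sigma^{2}) \ \text{i.i.d.}
```

The simple (VCS) model is the special case in which $\mathbf{z}_{ij}$ contains only the constant, so that $\boldsymbol{\Sigma}$ reduces to the single group-level variance.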

Data with only a moderate number of groups, as is the case in this analysis,
contain only limited information about variation, comparable to the limited
information about interactions in models of analysis of covariance. Information
about the covariance structure is usually even scarcer. Therefore, if a large number
of variances are included in the random part (and estimated as free parameters) we
can expect high correlations among the estimates: large estimated variances with
large standard errors. Also, the number of covariances to be estimated grows rapidly
with the number of variances, and many of the estimated correlations corresponding
to these covariances are then close to +1 or -1. The variance matrix with these
variances and covariances is not of full rank, and the random effects are linearly
dependent. Therefore it is important to adhere to the principle of parsimony and
seek the simplest adequate description for group-level variation. In selection of
covariances to be estimated we use the guidelines set by Goldstein (1987) and
Longford (1987).

Although model selection for the random part involves only pupil-level
variables (inclusion/exclusion), it is more complex than the selection for the fixed
part because constraints can be imposed also on the covariances. The most general
variance component model would involve 17 variances (the number of regression
parameters in Display 5) and 17 x 16/2 = 136 covariances. Fitting such a model is clearly
not a realistic proposition, and so model selection has to proceed by building up the
random part from simpler to more complex models.
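
The count of covariance parameters grows quadratically with the number of terms in the random part. A short Python sketch of the arithmetic (the function name is illustrative):

```python
def random_part_size(n_random_terms):
    """Free variances and covariances in an unconstrained random part."""
    variances = n_random_terms
    covariances = n_random_terms * (n_random_terms - 1) // 2
    return variances, covariances


print(random_part_size(17))  # every regression parameter of Display 5 random
print(random_part_size(6))   # intercept, XROT, three YMOREED contrasts, YDESIRE
```

The second call corresponds to the 6 x 6 variance matrix considered later in this section.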

In model selection for the random part we have proceeded in the following
stages. For all the models we used the same fixed part as in Display 5. The estimates
and standard errors for the regression parameters differed very slightly from those in
Display 5 for all these models. This fact justifies post hoc our approach of first
settling the fixed part and then proceeding with modelling of the random part.
First we fitted models with one pupil-level variable in the random part. Using the
likelihood ratio test to compare the fitted model to the model with simple random
part (Display 5) we selected the following variables: XROT, XAGE, YDESIRE and
YMOREED.

The first three variables are ordinal, and associated with one variance each.
The likelihood ratio (difference of deviances) for each of the three corresponding
models was larger than 3. This is a very conservative criterion, since we prefer to err
on the side of inclusion. There are two parameters involved, a variance (slope-variance)
and a covariance (slope-by-intercept covariance), but they are not free
parameters since they have to satisfy the condition of positive definiteness. The
distribution of the difference of the deviances is $\chi^2_2$ if the correlation corresponding
to the covariance is smaller than 1 in modulus. The problem of negative variances
is resolved by estimating the square roots of the variances (sigmas).

Next we fitted the VC model with these four variables in the random part,
and simplified the random part by excluding variables and setting certain
covariances to 0. The variance associated with the variable XAGE was very small
(.00095) and its square root had a low t-ratio (.75), and so it could be constrained to 0
(excluded). That implies a constraint on all the covariances involving XAGE, which
are also set to 0. The three remaining variables and the intercept are represented by a
6x6 variance matrix: 6 variances and 15 covariances, almost as many parameters as in
the fixed part. The fitted variance matrix is

                 Intercept     XROT    cat 2    cat 3    cat 4  YDESIRE
Intercept            2.581
XROT                 .0143   .00558
YMOREED cat 2         .191    .0388     .812
        cat 3         .519    .0439    .0621    1.032
        cat 4         .384    .0354   -.0241     .261    1.032
YDESIRE              .0863   -.0127    -.307    -.303    -.346     .667

The decrement in deviance, compared with the VCS model (Display 5), is only 13,
hardly warranting addition of these 21 parameters in the model.

The software used provides standard errors for the square roots of the
variances (sigmas, diagonal elements of the matrix) and for the covariances. The
sigmas and their standard errors are:

             Intercept     XROT    cat 2    cat 3    cat 4  YDESIRE
Sigma            1.607    .0747     .901    1.175    1.016     .828
St. Error         .176    .0261     .429     .451     .640     .295

The standard errors for the covariances involving XROT and categories of
YMOREED (rows 3-5 in column 2) are between .059 and .063, and for those involving
YDESIRE and YMOREED (columns 3-5 in row 6) between .56 and .62. Each of these
covariances has a small t-ratio, and so they were constrained to 0 in the next model.
The following estimated variance matrix was obtained (the sigmas and their
standard errors are given to the right of the variance matrix):

                 Intercept     XROT    cat 2    cat 3    cat 4  YDESIRE     Sigma  St. Error
Intercept            2.415                                                  1.554       .16?
XROT                 .0455   .00390                                         .0625      .0313
YMOREED cat 2            0        0        0                                    0          0
        cat 3        1.136        0        0    1.788                       1.337       .341
        cat 4         .740        0        0    1.157    1.424              1.193       .514
YDESIRE               .304   -.0436        0        0        0     .830      .911       .260

The rank of this matrix is 4 (the two variance matrices given above are also
singular), and so it would appear that another variance parameter could be
constrained to 0. However, the t-ratio for each of the sigmas is high, and only a
complex linear reparametrization of the variables included in the random part
would enable further model simplification. The variance matrix obtained provides
a description of group-level variation in terms of 11 parameters, 5 variances and 6
covariances. But the difference in deviance between this model and the corresponding
VCS model is only 11 (for 10 parameters). That provides further evidence of
overparametrization or collinearity in the random part. However, any attempt to
define a suitable model with fewer parameters would necessarily involve some
unnaturally defined variables, which would make interpretation of the model very
difficult.

Variation in the slope of XROT provides evidence of unequal 'conversion' of
ability at the beginning of the year into ability at the end of the year. Such a
conclusion is appropriate only subject to the caveats discussed in the Summary. The
slope of XROT is shallower in some schools, where the initial differences in XROT
tend to be associated with smaller differences in YROT than in schools where the
slopes are steeper.

The regression slope for YDESIRE is about .5; this is the regression slope for
the 'stereotype' school, where every feature is 'average'. The variation associated
with this regression slope has a standard deviation of .9, and so there is a large
(predicted) proportion of schools where the slope on YDESIRE is very small, or even
negative. The correlation of the within-group slopes on XROT and YDESIRE is -.77;
lower 'effects' of motivation to succeed are associated with schools where the initial
differences become exaggerated by the end of the year.
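
Both figures in this paragraph can be checked from the fitted random part. The Python sketch below assumes the variances .00390 (XROT slope) and .830 (YDESIRE slope) and the covariance -.0436 from the final variance matrix, takes the average YDESIRE slope as .480 (Display 6), and relies on the normality assumption of the model; the helper name is illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Variances and covariance of the XROT and YDESIRE slopes (final variance matrix).
var_xrot, var_ydesire, cov_xrot_ydesire = 0.00390, 0.830, -0.0436
correlation = cov_xrot_ydesire / math.sqrt(var_xrot * var_ydesire)

# Predicted share of schools with a negative YDESIRE slope, assuming normally
# distributed slopes centred on the average slope.
mean_slope = 0.480
share_negative = normal_cdf(-mean_slope / math.sqrt(var_ydesire))
print(f"correlation = {correlation:.2f}, share negative = {share_negative:.2f}")
```

Under these assumptions somewhat less than a third of schools are predicted to have a negative slope on YDESIRE.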

The variances associated with the categories 3 and 4 of YMOREED represent
the variation of the adjusted differences between categories 3 and 1 and 4 and 1,
respectively. While the fitted difference between categories 2 and 1 is about .8, and
constant for all the schools, the average within-school difference between categories
3 and 1 is 1.1, with a variance of 1.8. Therefore this difference is negative in several
schools. The situation with the 4-1 contrast is similar, although the number of
schools with reversed sign of the difference is much smaller. The correlation of the
random effects associated with the categories 3 and 4 is .725, so a high 3-1 contrast is
associated with a high 4-1 contrast, but the fitted variance for the contrast 4-3 is
1.79 + 1.42 - 2 x 1.16 = .89, whereas the average difference is 1.58 - 1.08 = .50. Hence
there are schools where the pupils with YMOREED = 3 have higher adjusted scores on
YROT than those with YMOREED = 4, although on average the 4th category is .5
points ahead.
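
The variance of the 4-3 contrast follows from the usual rule for the variance of a difference of two correlated quantities. A minimal Python sketch using the rounded values quoted in the text (the function name is illustrative):

```python
def contrast_variance(var_a, var_b, cov_ab):
    """Variance of the difference of two correlated random effects, Var(a - b)."""
    return var_a + var_b - 2.0 * cov_ab


# Random effects for the YMOREED contrasts 3-1 and 4-1 (rounded as in the text).
var_31, var_41, cov_31_41 = 1.79, 1.42, 1.16
var_43 = contrast_variance(var_41, var_31, cov_31_41)
sd_43 = var_43 ** 0.5
print(f"Var(4-3) = {var_43:.2f}, SD = {sd_43:.2f}")
```

Because the standard deviation of the contrast is larger than its average of .50, a reversed sign in some schools is to be expected.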

The estimates of the regression parameters differ only marginally for the 
different specifications of the random part. This justifies, post hoc, our approach of 
modelling first the regression part of the model and then the random part. The 
regression estimates for the last model considered are given in Display 6. 

Display 6: Fixed-effect Estimates for the Final Model with Random
Effects, for 3025 Students and 86 Classrooms/Schools Using
18 Explanatory Variables, Thailand 1981-82

Variable          Estimate St. Error

GRAND MEAN          16.642         -
XROT                     ?      .020
XAGE                     ?      .014
XSEX                 1.143      .260
YFOCCI                .101      .352
                     -.488      .374
                      .198      .446
YMEDUC                .347      .268
                      .062      .446
                     -.491      .560
YMOREED               .816      .453
                     1.117      .476
                     1.618      .514
YPARENC                  ?      .112
YPERCEV             -1.178      .133
YFUTURE               .526      .137
YDESIRE               .480      .217
SENROLT               .300      .265
SPUTEAR              -.063      .048
SQUALMT               .781      .380
TMTHSUB              2.632      .582
TXTBOOK              0.949      .431
TWORKBK              -.372      .270
TORDER1              -.035      .270
TSEAT1                .007      .006

Pupil-level Variance    35.259
Pupil-level Sigma        5.938
Group-level Variance    See matrix given in the text.
Deviance             19064.902
Number of iterations         8

? = entry illegible in the source.

Conditional expectations of the random effects.

In the fixed-effects ANOVA or ANCOVA, estimates of the effects associated
with the groups are obtained. In variance component models these effects are
represented by random variables. Conditional upon the adopted model, the
expectations of the (random) group effects can be considered as the group-level
residuals, or as "estimates" of the group effects. These conditional expectations have
to be inspected to see whether they conform with the assumptions of normality. This
inspection involves a check for skewness and kurtosis (not carried out here, but
visual inspection indicates no problems), and a check for outlying values of the
effects. The latter check is obviously also of substantive importance because it would
be useful to detect schools with exceptionally high or low performance, schools where
the categories of YMOREED have substantially different differences than in the average
school, and schools in which the outcomes are more or less influenced by the initial
score XROT. The complex nature of variation, involving three variables, coupled with
the number of groups, makes it infeasible to discuss the deviations of the group-
level regressions from the average regression. In fact, the main motivation in use of
variance component analysis has been to obtain a global description of variation,
without reference to the individual groups. The added advantage is that, owing to
the shrinkage property of the conditional expectations, extreme results due to
unreliability for some of the schools with small numbers of students are avoided.
The conditional expectations are a mixture of the pooled (average) regression and the
ordinary least squares solution of the within-group regression; the weight depends on
the amount of information contained in the data from the group. Conditional
expectations are obtained [...] number of regression parameters. Owing to this
shrinkage we cannot pinpoint all the schools where (say) the difference of the
categories 3 and 1 has a negative sign. For several schools the conditional means
indicate a small difference between the categories; some of these may be negative,
others positive and larger than the conditional expectation. Accordingly we should
downscale our notion of what is an exceptionally large deviation; say, 1.5 multiple
of the standard deviation (sigma) should be regarded as exceptional.
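
The dependence of the shrinkage on group size is easiest to see in the simple random-intercept case, which is an assumption made here for illustration; the model in the text also has random slopes, for which the weights are matrix-valued. In that simple case the weight given to a school's own data is n tau^2 / (n tau^2 + sigma^2). The Python sketch below uses the variance components of Display 5; the function name is hypothetical.

```python
def shrinkage_weight(n_pupils, group_variance, pupil_variance):
    """Weight on a group's own mean residual in a random-intercept model."""
    return n_pupils * group_variance / (n_pupils * group_variance + pupil_variance)


# Variance components from Display 5 (simple VCS model).
TAU2, SIGMA2 = 2.353, 36.138

for n in (5, 20, 42):
    w = shrinkage_weight(n, TAU2, SIGMA2)
    print(f"n = {n:2d}: weight on the school's own data = {w:.2f}")
```

Small schools are pulled strongly towards the average regression, which is why extreme conditional expectations for them are avoided.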

We conclude with an example of an exceptional school. School 22 (42 pupils
in the data) has all its random-effects components positive. Its deviation from the
average regression formula is

1.517 + .100 XROT + .102 YDESIRE + 1.008 YM3 + .842 YM4,

where YM3 (and YM4) are equal to 1 if the pupil is in category 3 (4), and 0 otherwise.
This indicates that it is a school with high performance where the differences in
initial ability tend to get exaggerated, and pupils with high motivation and high
expectations are at an advantage. For sample mean values of XROT and YDESIRE
this formula becomes

2.959 + 1.008 YM3 + .842 YM4,

which reflects the high 'performance' of the school much more clearly. The
variances quoted above refer to a regression using centred versions of all the
variables

$(\mathrm{XROT} - \overline{\mathrm{XROT}},\ \mathrm{YDESIRE} - \overline{\mathrm{YDESIRE}},\ \mathrm{YM3} - \overline{\mathrm{YM3}},\ \mathrm{YM4} - \overline{\mathrm{YM4}})$.

In the transformation from one parametrization to the other only the intercept-
variance is affected.


DISCUSSION 

At the outset of this paper, we posed three substantive questions and one
methodological question: (a) What characteristics of teachers and schools enhance
student achievement? (b) Are these effects uniform across different students? (c)
What is the comparative effectiveness of alternative inputs? and (d) How do
estimates obtained from simple OLS methods compare with estimates obtained from
multilevel methods? During the development of the analysis, a fifth question arose:
Are there alternative regression models that predict student achievement equally
well as the model developed herein? In this section, we review our findings and
present some caveats about their interpretations.



Summary 

Effective teacher and school characteristics. The results from our final
analysis indicate that there are teacher and school characteristics that are positively
associated with student learning. These are:

• the percentage of teachers in the school that are qualified to teach 
mathematics, 

• an enriched mathematics curriculum, and 

• the frequent use of textbooks by teachers. 

At the same time, some teaching practices are negatively related with learning; for 
this sample they are: 

• the frequent use of workbooks, and 
• time spent on maintaining order in the classroom.

The positive results are not surprising. Teachers who know the subject matter being
taught, a curriculum that covers the domain, and textbooks that provide a structured
presentation of the material all should have positive effects on achievement. The
negative results are more curious. On the one hand, teachers who spend a great deal
of time maintaining classroom order will have less time available for teaching;
therefore, less learning takes place. On the other hand, the use of workbooks ought
to contribute positively to achievement, not detract from it. Possibly the use of
workbooks substitutes for something else: direct instruction, perhaps.

Uniformity of effects. In this sample, we found that schools did not have
uniform effects on all students. In particular, effects differed according to the level of
educational expectations held by the students. Some schools/classrooms were more
effective for students with low expectations, some were more effective for students
with high expectations, while other schools were equally effective (or ineffective) for
all types of students. Interestingly enough, we found little evidence that schools
were differentially effective for students on the basis of sex, age, parental occupation
or several other student attitudes. Thus, Thai schools were operating, by and large,
in an egalitarian fashion, with the one exception of differences according to
educational expectations.

Comparative effectiveness of inputs. Overall, we found few school "inputs"
that were associated with differential achievement over time. Frequent use of
textbooks increased achievement by a full point on the posttest, while use of
workbooks decreased achievement by a third of a point; an enriched curriculum
increased posttest scores by over 2.5 points. Each additional percentage of teachers
that were qualified to teach mathematics raised posttest scores by over one point.

However, these causal statements do not hold if they are to be interpreted as a
result of an external intervention. Obtaining additional textbooks for the schools is
not a simple procedure unrelated to educational processes and management
decisions; it is itself an outcome variable related to some (unknown) aspects of the
educational process. Similarly, discarding workbooks would not lead to improved
outcomes, unless all the circumstances that lead to reduced use of workbooks are
also present, or are induced externally. External intervention will be free of risk only
if we have, and apply, causal models for how the educational system functions. The
models developed in this paper, and elsewhere in educational research literature, are
purely descriptive. Use of regression methods, and of variance component analysis,
allows improved description, but does not provide inference about causal
relationships.
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Also, interpretations of estimates of effects are subject to a variety of 
influences, and there may be alternative regression models, with different variables, 
that are equally correct in terms of prediction. Thus, the selection of variables 
included in this model is responsible, to some degree, for the results, and a different 
seleciton of variables could yield substantially different results with rc ^ . ct to the 
contribution of each variable. 

Comparison with OLS. The analysis carried out demonstrates that estimates
based on OLS regressions do yield different results, in some cases, from those based on
VC regressions. For example, in comparing the OLS estimates with the VCS
estimates in Display 6, we see that for TMTHSUB, the coefficients are quite different.
Using OLS, we would conclude that students in "enriched" classes, controlling for
the other explanatory variables, perform about 2 points (13%) higher than those in
"normal" or "remedial" classes; the conclusion based on the VC regression is that
they perform nearly 2.6 points (17%) higher. Combining these effects with cost
information permits an estimation of cost-effectiveness. If enriched classes cost 13%
more than remedial/normal classes, we would conclude that they were either equally
cost-effective (OLS) or more cost-effective (VC) than remedial/normal classes, depending
on the model. Similarly, if enriched classes cost 17% more than remedial/normal
classes, they would be either equally cost-effective (VC) or less cost-effective (OLS),
depending on the model. However, the caution about causal inference in the
previous subsection applies equally in this context. Classes, or schools, cannot be
declared to have an enriched curriculum at an external will and by supplying the
outward signs of having an enriched curriculum; rather, a whole complex of related
circumstances has to be arranged, e.g., strengthened education in lower grades,
synchronization with other subjects, etc. Since we have argued earlier in the paper
that estimates based on VC methods are preferable to those based on OLS methods,
differences of these types could hold important policy implications for schools
deciding on the type of curriculum to choose.
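
The cost-effectiveness comparison in the preceding paragraph reduces to a ratio of percentage gain to percentage extra cost. The following is a minimal sketch of that arithmetic, using only the 13% and 17% figures quoted above; the function names and the ratio itself are illustrative assumptions and are not part of the original analysis.

```python
def cost_effectiveness_ratio(gain_pct: float, extra_cost_pct: float) -> float:
    """Percentage gain in achievement per percentage point of extra cost.

    Hypothetical index, introduced only for this illustration.
    """
    return gain_pct / extra_cost_pct


def verdict(ratio: float) -> str:
    """Classify enriched classes relative to remedial/normal classes."""
    if ratio > 1.0:
        return "more cost-effective"
    if ratio < 1.0:
        return "less cost-effective"
    return "equally cost-effective"


# Estimated gains for enriched classes as quoted in the text.
GAINS_PCT = {"OLS": 13.0, "VC": 17.0}

if __name__ == "__main__":
    for extra_cost_pct in (13.0, 17.0):  # the two cost scenarios in the text
        for model, gain_pct in GAINS_PCT.items():
            ratio = cost_effectiveness_ratio(gain_pct, extra_cost_pct)
            print(f"cost +{extra_cost_pct:.0f}%, {model}: {verdict(ratio)}")
```

The verdict changes with the estimation method alone, which is the point of the comparison; the causal caution stated above applies to any such ratio.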



Caveats 

We have noted that alternative models could yield similar predictions (in
terms of achievement), but might include a different set of variables. That such
could be the case is not a problem limited to VC models; it is a perennial problem
with these general types of analyses. In our analysis, we have included a number of
individual pupil and school/classroom variables; in this respect, we have moved
well beyond earlier models, which include only modest "intake" characteristics of
students. Having identified the variables associated with higher outcome scores
does not offer a direct answer to the principal question of a development agency
about distribution of its resources to a set, or a continuum, of intervention policies
in an educational system. Without any prior knowledge of the educational system,
any justification for an intervention policy based on the results of regression (or
variance component) analysis, or even of structural modelling (LISREL), would
have no proper foundation. Certain intervention policies may cause a change in the
educational system, and hence a change in the regression model itself. This new
regression model may indicate that the selected intervention is far from optimal, or
may even be detrimental.

A case in point is the pretest score XROT. Its coefficient is positive and of
substantial magnitude. A conceivable intervention policy would be to raise the
XROT scores, for example, by coaching prior to pretest administration. Clearly such
an intervention, if effective, could lead to a change in the regression formula.
Alternatively, if coaching took place between the pretest and posttest
administrations, the regression formula would again be changed, but differently.
Any number of different scenarios are easy to construct, in which the coefficient on
XROT would be close to 1, or substantially lower than .62 (obtained in our analysis).

Similarly, indiscriminate reduction of the time spent on maintaining order
in the classroom, probably a less expensive intervention in monetary terms, is likely
to be an unreasonable solution. Introduction of the enriched mathematics
curriculum for all students is most likely not practicable, and even its extension to a
few more classrooms may place excessive requirements on staff in the schools, thus
lowering the quality of instruction in other subjects and/or other grades.

In conclusion, positive or negative regression coefficients cannot be regarded
as indicators of cause, effect, or influence. An intervention could be regarded as an
experiment, and its outcome can be predicted from an observational study only
under the unrealistic assumption that the regression formula accurately describes
the mechanics of a rigid educational process.

Three important items of information would assist in answering the question 
about allocation of resources: 

1. Feasibility and cost of various interventions. 

2. How will an intervention affect other explanatory variables, and which
aspects of the educational process will remain unaltered after the
intervention?

3. How directly manipulable are the "interventions"?

It is key to make a distinction between variables that are manifest
(unchangeable, e.g., pupil background), that are manipulable (e.g., time spent on a
task of a particular kind), and that are manipulable only by direct intervention. For
example, the time spent on maintaining discipline is a manipulable variable, but it
can be manipulated either indirectly (e.g., by making the curriculum more
interesting by providing more suitable or more interesting textbooks) or directly
(through changing teacher behavior, so as to ignore disruptive student behavior).
Effective education policy considerations require attention to directly manipulable
variables; in the present analysis, these are the qualifications of the mathematics
teachers in the school and the use of textbooks.



^These hierarchical structures result from design elements (stratified sampling), data
collection technicalities (e.g., interviewer effect) or intrinsic interest in cross-level effects (e.g.,
the effects of post-natal feeding programs on the relationship between birth weight and
subsequent cognitive development).

^An extended discussion of this is provided by H. Goldstein (1987). 

^For more detail on the construction of the achievement measures, see Lockheed, Vail &
Fuller, 1986.


INSTRUCTIONALLY SENSITIVE PSYCHOMETRICS:
APPLICATIONS OF THE SECOND INTERNATIONAL MATHEMATICS STUDY

Bengt O. Muthén
CRESST and
Graduate School of Education
University of California, Los Angeles

1. Introduction 

This paper discusses new psychometric analyses that improve capabilities for
relating performance on achievement test items to instruction received by the
examinees. The modeling discussion will be closely tied to data for U.S. eighth grade
students provided by the Second International Mathematics Study (SIMS), comprising
not only responses to a set of achievement items at the beginning and end of the eighth
grade but also a relatively rich set of student background information, including
opportunity-to-learn (OTL) information specific to each item (Crosswhite, Dossey,
Swafford, McKnight, & Cooney, 1985).

Item Response Theory (IRT) is a standard psychometric approach for analyzing a
set of dichotomously scored test items. Standard IRT modeling assumes that the items
measure a unidimensional trait. This particular kind of latent trait model is used to
assess the measurement qualities of each item and to give each examinee a latent trait
score. As will be shown, however, IRT modeling is limited in ways that are a hindrance
to properly relating achievement responses to instructional experiences. Taking IRT as a
starting point, this paper summarizes the author's work on a set of new analytic
techniques that give a richer description of achievement-instruction relations. Six topics
that expand standard IRT and specifically deal with effects of varying instructional
opportunities (OTL) will be discussed as outlined below.

1. Variation in latent trait measurement characteristics. This relates to the
classic IRT concern of "item bias," here translated as the absence or presence of an added
advantage due to OTL in getting an item right.

2. Multidimensional modeling. Inclusion of narrowly defined, specific
factors closely related to instructional units in the presence of a general, dominant trait.

3. Modeling with heterogeneity in levels. Analyses that take into account that
achievement data often are not sampled from a single student population but one with
heterogeneity of performance levels.

4. Estimation of trait scores. Deriving scores based on both performance and
background information for both general and specific traits.

5. Predicting achievement. Latent trait modeling that relates the trait to student
background variables.

6. Analyzing change. Relating change in general and specific traits to OTL.

The SIMS data will be used throughout to illustrate the new methods. All
analyses will be carried out within the modeling framework of the LISCOMP computer
program (Muthen, 1984, 1987).

Section 2 describes the SIMS data to be analyzed. Section 3 describes general
features of the psychometric problem. Section 4 presents a descriptive analysis of the
achievement-instruction relation for the SIMS data and sets the stage for later
modeling. Sections 5-10 discuss methods topics 1-6 listed above.

2. The SIMS data 

The Second International Mathematics Study (Crosswhite, Dossey, Swafford,
McKnight, & Cooney, 1985) was conducted in order to study variations in mathematics
knowledge for eighth and twelfth graders within and across several countries. To this
aim, multiple-choice mathematics achievement responses were collected on items in
the areas of arithmetic, algebra, geometry, measurement, and statistics. The test was
administered both in the Fall and in the Spring of each grade. The achievement test
consisted of 180 items distributed among five test forms. Each student responded to a
core test of 40 items and one of four randomly assigned rotated forms with about 35
items. For the part of the sample that we will be concerned with, the core test was
administered both during the Fall and the Spring to all students in the study while the
rotated forms varied in their use pattern. It is well known that eighth grade
mathematics curricula vary widely, certainly for students in the U.S. To be able to better
describe the variation in student math achievement, information related to these
curricular differences was therefore also collected. A detailed part of this information
was opportunity-to-learn (OTL) for the topics covered by each test item. For the U.S.
eighth grade math students, information was also collected in order to make a
distinction between "tracks" or class type, yielding a categorization into Remedial,
Typical, Enriched, and Algebra classes. This classification was based on teacher
questionnaire data and on information on textbooks used. A variety of other teacher-
related information was also collected, such as topic emphasis and teaching style.
Student background information on family, career interests, and attitudes was also
collected. We will concentrate our analysis on the U.S. eighth graders (for whom there
are about 4,000 observations from both Fall and Spring) sampled from about 200
randomly sampled classrooms varying in size from about 5 to 35 students. We will be
particularly concerned with analyses of the 40 core items, but will also report on analyses
of the four rotated forms which, when combined with the core items, represent about 75
items administered to the about 1,000 students taking each form. The rotated form
analyses will be presented as a cross-validation of findings for the core items. In this
way, the SIMS data provide a uniquely rich set of data with which to study
instructionally sensitive psychometrics.

In the analyses that follow, a key piece of instructional information was obtained
from the teacher questionnaire. For each item, teachers were asked two questions
regarding opportunity to learn.

Question 1:

"During this school year did you teach or review the mathematics needed to
answer the item correctly?"
1. No
2. Yes
3. No response

Question 2:

"If in the school year you did not teach or review the mathematics needed to
answer this item correctly, was it mainly because?"
1. It had been taught prior to this school year
2. It will be taught later (this year or later)
3. It is not in the school curriculum at all
4. For other reasons
5. No response

Using these responses, opportunity-to-learn (OTL) level will be defined as:
No OTL: Question 1 (= 1), question 2 (= 2, 3, 4, or 5)
Prior OTL: Question 1 (= 1 or 3) and question 2 (= 1)
This Year OTL: Question 1 (= 2), question 2 (= 9) (other response combinations
had zero frequencies)

In most analyses to follow, Prior OTL and This Year OTL will be combined into a
single OTL category.
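
The coding rules above can be written as a small lookup. The sketch below is illustrative only: the function names are hypothetical, and it assumes that a "Yes" on Question 1 is sufficient for This Year OTL, so the Question 2 response is ignored in that case.

```python
from typing import Optional


def otl_level(q1: int, q2: int) -> Optional[str]:
    """Map teacher responses for one item to an OTL level.

    q1: 1 = No, 2 = Yes, 3 = No response
    q2: 1 = taught earlier, 2 = taught later, 3 = not in curriculum,
        4 = other reasons, 5 = no response
    """
    if q1 == 2:
        # Assumption: Question 2 is not consulted when Question 1 is "Yes".
        return "This Year OTL"
    if q1 in (1, 3) and q2 == 1:
        return "Prior OTL"
    if q1 == 1 and q2 in (2, 3, 4, 5):
        return "No OTL"
    return None  # combination not covered by the definitions in the text


def has_otl(q1: int, q2: int) -> Optional[bool]:
    """Dichotomous version: Prior OTL and This Year OTL combined."""
    level = otl_level(q1, q2)
    if level is None:
        return None
    return level != "No OTL"
```

For instance, `otl_level(1, 3)` gives "No OTL" (the topic is not in the curriculum), while `has_otl(3, 1)` is true because the topic was taught in an earlier year.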

3. The General Problem 

In general, psychometric modeling assumes independent and identically
distributed (iid) observations from some relevant population. This assumption is also made
in IRT. The assumption of identically distributed observations is not realistic, however,
when using data of the SIMS kind to describe either relationships between what is measured
(achievement responses) and what the measurements are attempting to capture (the
traits), or how traits vary with relevant covariates such as instructional exposure and
student background. This is because of the instructional heterogeneity of the students
analyzed. The distribution of responses conditional on various trait values cannot be
expected to be identical for a student who has had no specific instruction on the item
topic and a student who has had instruction. The trait distribution cannot be expected to
be the same for students in enriched classes as for students in typical classes. The
students are naturally sampled from heterogeneous populations. It is true that
increased homogeneity can be obtained by dividing the students into groups based on
instructional experiences. However, such groupings may have to be very detailed to
achieve their purpose and any simple grouping may be quite arbitrary. A more
satisfactory approach is to use modeling that allows for heterogeneity, using parameters
that vary for varying instructional experiences. Such modeling also accomplishes the
goal of instructionally sensitive psychometrics, namely explicitly describing the
achievement response-instructional experiences relations.

4. Descriptive analyses 

It is informative to consider descriptively how the achievement responses vary
with instructional exposure. This forms a basis for our subsequent modeling efforts.
We will study this in terms of both univariate and bivariate achievement distributions
using the posttest core items administered to the U.S. eighth graders. We will also study
the change in univariate responses from pretest to posttest.

4.1 Univariate response 

Consider first the univariate responses for the posttest. The wording of the core
items is given in the appendix. The proportion correct for each item is described in
Table 1, broken down by the class type categories Remedial, Typical, Enriched, and
Algebra and by the OTL categories No OTL, This Year OTL, and Prior OTL. From the
totals it is seen that both class type and OTL have a strong effect on proportion correct.

TABLE 1

Percentage Students and Percentage Correct for Core Items by OTL and Class Type

                       Total*        No OTL       This Year OTL     Prior OTL
Item          Class    PR   PO     ST   PR   PO     ST   PR   PO     ST   PR   PO

[illegible]   TOT      47   56      7   44   38     86   46   56      7   64   73
              REM      33   31     19   37   31     81   32   30      0    0    0
              TYP      44   52      8   47   41     92   44   53      0    0    0
              ENR      51   66      4   41   32     93   52   68      3   43   57
              ALG      66   72      0    0    0     43   65   68     57   66   75

[The rows for the remaining items are incomplete in this copy. Surviving values are
listed in their original order; [...] marks a gap.]

AR37          15 21 23 65 29 36 21 [...] 52 [...] 14 12 38 11 [...] 62 16 14 [...]
              26 31 17 24 24 73 27 33 [...] 28 32 [...] 36 46 [...]
              19 30 62 36 48 32 40 46
              ALG   57 69 [...] 39 67 24 49 63 71 61 71
[illegible]   [...] 51 [...] 23 91 34 51 [...] 61 72
              REM   16 25 [...] 75 17 91 16 25 0 0
              TYP   31 45 [...] 27 25 97 31 46 [...] 0
              ENR   42 66 0 0 0 97 43 66 [...] 35 52
              ALG   61 0 0 0 43 57 62 57 63 74
[...]         35 47 47 33 41 [...] 37 52 [...] 52 56 [...] 24 31 93 24 31 [...] 21 21 [...] 0
              TYP   32 43 46 30 38 54 34 46
              ENR   39 b6 32 35 44 66 42 63 2 22 50
              ALG   52 60 56 53 59 19 44 68 26 57 57

* Percentage of students by class type are:
REM = Remedial: 7.1 (N=268), TYP = Typical: 57.6 (N=2148),
ENR = Enriched: 24.4 (N=909), ALG = Algebra: 10.7 (N=399)

ST = Percentage students
PR = Percentage correct for pretest
PO = Percentage correct for posttest

ME = Measurement
AR = Arithmetic
AL = Algebra
GE = Geometry

For most items the proportion correct is higher for Enriched and Algebra classes
than for Remedial and Typical classes. For almost all items the proportion correct
increases when moving from No OTL to This Year OTL to Prior OTL. The reason why
Prior OTL gives a higher proportion correct than This Year OTL is partly because Prior
OTL is more common for Enriched and Algebra classes, to which we presume students of
higher achievement levels have been selected. OTL appears to have an overall positive
effect on proportion correct also when controlling for class type, at least for Typical
classes. Also, when controlling for OTL, class type seems to still have a strong effect.
These univariate relationships are informative but confound effects of instructional
exposure with effects of student achievement level. For example, the higher proportion
correct for a certain item for students with Prior OTL may be solely due to such students
having a higher achievement level on the whole test. It would be of interest to know if
students with the same achievement level perform differently on a certain item for
different instructional exposure. To this aim, we may consider the total score on the
posttest as the general mathematics achievement level of each student and study the
variation of proportion correct for each item as a function of instructional exposure
conditionally on the general achievement level. We have carried this out using the
dichotomous version of OTL, combining Prior OTL with This Year OTL into a single
OTL category.

For each value of the achievement variable we then have a proportion correct for 
a No OTL and an OTL group and can study whether OTL makes a difference. 
Conversely, for each of the two OTL categories we will present the distribution of the 
achievement variable in order to study whether having OTL for an item implies that 
these students have a higher general achievement level. These plots are given in 
Figures 1-9. 
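
The conditional proportions described above can be computed directly from student-level records. The following is a minimal sketch, assuming each record holds a student's total score, the dichotomous OTL status for the item, and whether the item was answered correctly; the function name and the record layout are hypothetical and do not describe the SIMS file format.

```python
from collections import defaultdict


def empirical_icc(records):
    """Proportion correct on one item by total score, separately by OTL group.

    records: iterable of (total_score, has_otl, correct) tuples.
    Returns {False: {score: proportion}, True: {score: proportion}},
    where the outer key is the OTL status.
    """
    counts = defaultdict(lambda: [0, 0])  # (group, score) -> [n_correct, n]
    for total_score, has_otl, correct in records:
        cell = counts[(bool(has_otl), total_score)]
        cell[0] += int(bool(correct))
        cell[1] += 1

    curves = {False: {}, True: {}}
    for (group, score), (n_correct, n) in sorted(counts.items()):
        curves[group][score] = n_correct / n
    return curves


if __name__ == "__main__":
    # Made-up records for illustration only; these are not SIMS data.
    toy = [(10, False, 0), (10, False, 1), (10, True, 1),
           (20, False, 1), (20, True, 1), (20, True, 1)]
    print(empirical_icc(toy))
```

Plotting the two inner dictionaries against total score gives one pair of curves per item, which is the kind of display the figures present.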

Figure 1 describes items 1, 2, and 3. The left-most panel shows the total score
distribution given No OTL and OTL, respectively. We note that the score distributions
have different locations with the OTL distribution having somewhat higher mean,
supporting the notion that students who receive OTL perform better as measured by this
test. We also note that the variances of the two distributions are about the same. The
score distributions shown are representative of all core items.

The right-most part of Figure 1 and Figures 2-9 contain curves showing the
proportion correct for given total score for the two OTL categories. For each item and
both OTL categories, proportion correct increases with total score, indicating that for both
OTL categories the item is a good indicator of the general achievement variable which
the total score represents. It is particularly noteworthy that this is true also for the No
OTL category and that the No OTL and OTL curves most often are very close. The
students who, according to their teachers, have not been taught the mathematics needed
to answer the item correctly still appear to have a high probability of answering the item
correctly and this probability increases with increasing total score. This may indicate that
students can to a large degree draw on related knowledge to solve the item. It may also
indicate unreliability in the teachers' OTL responses. However, the differences in score
distributions for the core items show that the OTL measures have consistent and strong
relations to the total score. Instead of unreliability there may be a component of
invalidity involved in the teachers' responses, where OTL may to some extent be
confounded with average achievement level in the class and/or the item's difficulty.

The score distributions show that OTL is correlated with performance. Our
hypothesis is that OTL helps to induce an increased level of the general achievement
variable and that in general it is this increased level that increases the probability of a
correct answer, not OTL directly. In this way, moving from the No OTL status to the
OTL status implies a move upwards to the right along the common curve for No OTL
and OTL.

FIGURE 1

FIGURE 3
Proportion Correct: No OTL (square)/OTL (triangle)
Panels: Core Test Items 8, 9, 11, and 12; horizontal axis: Total Score.

FIGURE 4
Proportion Correct: No OTL (square)/OTL (triangle)
Panels: Core Test Items 13, 14, 15, and 16; horizontal axis: Total Score.

FIGURE 5
Proportion Correct: No OTL (square)/OTL (triangle)

There are some exceptions to the general finding of common curves for the No
OTL and OTL categories. For example, items 3, 17, and 39 show a large positive effect of
having OTL. Several other items with sizeable numbers of students in the two OTL
categories also show positive effects. This means that for these items, the added
advantage of having OTL is not fully explained by a corresponding increase in total
score. OTL directly affects the success in solving the item correctly. From Table 1 we
find that for the three items listed, the proportion correct increases strongly when
moving from the No OTL category to the OTL categories. However, Table 1 cannot be
counted on for finding items with direct OTL effects of this kind, since several other
items also show strong increases in proportion correct due to OTL. We will return to
the interpretation of this type of effect in Section 5. Note also that with the exception of
item 3 any OTL effect appears to be such that the two curves are approximately parallel,
implying that the OTL effect is constant across achievement levels. For item 3 the OTL
advantage increases with increasing achievement level, perhaps because it is a difficult
item.

4.2 Bivariate responses 

The various descriptive analyses carried out for the univariate responses can be
carried over to bivariate responses. A common measure for studying relationships
among dichotomous items is the tetrachoric correlation coefficient (Lord &
Novick, 1968). In line with the previous section, we may study the strength of
association between each pair of achievement items by computing three sets of
correlations, using all students, students with OTL on neither of the pair of
variables, and students with OTL on both of the pair of variables. For each of the sets,
the average correlation across all pairs gives an indication of the degree of homogeneity
of the items in their measurement of achievement. It is of interest to study if this
homogeneity is affected by OTL. Further, in line with the previous section, the
corresponding three sets of correlations may be computed conditional on the total test
score viewed as a general achievement variable. For lack of space these analyses will not
be presented here, except to note that the homogeneity of correlations does not seem to
be affected by OTL.
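
For a rough sense of the measure, a tetrachoric correlation can be approximated in closed form from the 2x2 table of two dichotomous items. The sketch below uses the classical cos-pi approximation, r = cos(pi / (1 + sqrt(ad/bc))); this is an assumption made for illustration and not the estimator used in the analyses reported here, and the function name is hypothetical.

```python
import math


def tetrachoric_cos_pi(a: float, b: float, c: float, d: float) -> float:
    """Cos-pi approximation to the tetrachoric correlation.

    2x2 table of counts for two dichotomous items:
                     item 2 correct   item 2 incorrect
    item 1 correct         a                 b
    item 1 incorrect       c                 d

    All four cells must be positive; the approximation is undefined otherwise.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))
```

Computing this for every item pair within a group of students and averaging the results gives the kind of homogeneity index described above.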

4.3 Change of univariate responses

The SIMS core items also provide the opportunity to study changes in proportion
correct for each item from the Fall testing to the Spring testing. This change can be
related to OTL. For each item we may distinguish between three groups of students:
those who did not have OTL before the pretest or before the posttest (the No OTL group),
those who had OTL before the pretest (Prior OTL), and those who did not have OTL
before the pretest but did have OTL before the posttest (This Year OTL). The change for
the No OTL group gives an indication of change due to learning on related topics. The
change for the Prior OTL group gives an indication of effects related to practice, review,
and, perhaps, forgetting. The change for the group having This Year OTL reflects the
direct exposure to the topic represented by the item. These changes can be studied in
Table 1. Table 1 shows that, where changes occur, they are largely positive for each OTL
category, with the largest changes occurring for students in the category of This Year OTL,
as expected. They may be taken to support the dependability of the teacher-reported OTL
measure.
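
As a worked example of the pretest-to-posttest comparison, the sketch below takes the pretest (PR) and posttest (PO) percentages from the TOT row of the first item listed in Table 1 and computes the change in each OTL category. The variable names are illustrative.

```python
# (PR, PO) percentage correct by OTL category, TOT row of the first item in Table 1.
tot_row = {
    "No OTL": (44, 38),
    "This Year OTL": (46, 56),
    "Prior OTL": (64, 73),
}

# Change in percentage correct from Fall (pretest) to Spring (posttest).
changes = {group: po - pr for group, (pr, po) in tot_row.items()}

# OTL category with the largest gain for this item.
largest = max(changes, key=changes.get)

if __name__ == "__main__":
    for group, change in changes.items():
        print(f"{group}: {change:+d} percentage points")
    print("largest change:", largest)
```

For this item the gain is largest in the This Year OTL category, in line with the pattern described above.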

5. Variations in latent trait measurement characteristics 

The study of the univariate achievement responses in Section 4.1 showed that the
set of core test items served as good indicators of the total test score. We may
hypothesize that this test score is a proxy for a general mathematics achievement
variable as measured by the combined content of the set of core items. However, the
total test score is a fallible measure and what we are interested in are the relationships
between the items and the true score and estimates of the true scores. This is a situation
for which Item Response Theory (IRT) has been proposed as a solution (see, for
example, Lord, 1980). The curves of Figures 1-9 are, in IRT language, empirical item
characteristic curves, which as theoretical counterparts have conditional probability
curves describing the probability correct on an item given a latent trait score. We will
now describe the IRT model and how it can be extended to take into account
instructional heterogeneity in its measurement characteristics.

In formulas the IRT model may be briefly described as follows. Let y* be a p 
vector of continuous latent response variables that correspond to specific skills needed 
to solve each item correctly for item j, 

(1) yj = 0,ify*£tj 

i, otherwise 

where O denotes the incorrect answer, 1 denotes the correct answer, and tj is a threshold 
parameter for item j corresponding to its difficultv. Assume also that the latent response 
variable y*^] is a function of a single continuous latent h and a residual ej^ 

(2) y»j-Ijh + ej. 
where lj is a slope parameter for item j, interpretable as a factor loading. With proper 
assumptions on the right-hand-side variables, this gives rise to the two-parameter 
normal ogive IRT model. For each item there are two parameters, tj and lj. The 
conditional probability of a correct response on item j is 

(3) $P(y_j = 1 \mid h) = F[(-t_j + l_j h)\, q^{-1/2}]$ 

where F is the standard normal distribution function and q is the variance of ej. This 
means that the threshold tj determines the item's difficulty, that is, the horizontal 
location of the probability curve, and the loading lj determines the slope of the 
probability curve. 
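
As a minimal numerical sketch of the normal ogive curve in equation (3), the following Python function evaluates the conditional probability of a correct response. The function names, the parameter values, and the use of the error function to compute the standard normal distribution function are illustrative assumptions and are not taken from the original analysis.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function F, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def icc(h, t_j, l_j, q=1.0):
    """Equation (3): P(y_j = 1 | h) = F[(-t_j + l_j * h) * q**(-1/2)]."""
    return normal_cdf((-t_j + l_j * h) / math.sqrt(q))


# Hypothetical parameter values, chosen only for illustration.
easy = dict(t_j=-0.5, l_j=0.8)
hard = dict(t_j=0.5, l_j=0.8)
for h in (-2.0, 0.0, 2.0):
    print(f"h={h:+.1f}  easy={icc(h, **easy):.3f}  hard={icc(h, **hard):.3f}")
```

Under these assumptions a larger threshold moves the curve to the right, so that a higher trait value is needed for the same probability of a correct response, while a larger loading makes the curve steeper.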

In Section 4.1 we investigated descriptively whether the conditional proportion 
correct given total test score varied across OTL groups. In IRT language this is referred to 
as investigating item bias or, using a more neutral term, differential item functioning. 
Standard IRT assumes invariant item functioning across different groups of individuals. 
A variety of bias detection schemes related to IRT have been discussed in the literature. 
Concerns about item bias due to instructional heterogeneity have recently been raised in 
the educational measurement literature. Conflicting results have been found in 
empirical studies. For example, Mehrens and Phillips (1986, 1987) found little 
difference in measurement characteristics of standardized tests due to varying curricula 
in schools, while Miller and Linn (1988), using the SIMS data, found large differences 
related to opportunity to learn, although these differences were not always interpretable. 
Muthen (1989) pointed out methodological problems in assessing differential item 
functioning when many items may be biased. He suggested a new approach based on a 
model which extends the standard IRT. The analysis is carried out by the LISCOMP 
program (Muthen, 1987). This approach is particularly suitable to the SIMS data 
situation with its item-specific OTL information and it will be briefly reviewed here. 

Let x be a vector of p OTL variables, one for each achievement item. The x 
variables may be continuous, but assume for simplicity that xj is dichotomous with xj = 
0 for No OTL and xj = 1 for OTL. Consider the modification of equation (2) 

(4) $y^* = l h + B x + e$ 

where in general we restrict B to a diagonal p x p matrix. The diagonal element for item 
j is denoted bj. The OTL variables are also seen as influencing the trait h, 

(5) $h = g'x + z$ 

where g is a p-vector of regression parameter slopes and z is a residual. 
It follows that 

(6) $P(y_j = 1 \mid h, x_j) = F[(-t_j + b_j x_j + l_j h)\, V(y_j^* \mid h)^{-1/2}]$ 


In effect then, the bj coefficient indicates the added or reduced difficulty in the item due 
to OTL. Equivalently, using equation (4), we may see this effect as increasing y*j, the 
specific skill needed to solve item j. 
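
The way a direct OTL effect moves the item characteristic curve can be sketched numerically. The Python fragment below evaluates equation (6) with and without OTL; the parameter values are hypothetical, and the conditional variance is held at a fixed illustrative value (an assumption made only to keep the sketch short).

```python
import math


def normal_cdf(z):
    """Standard normal distribution function F."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def icc_otl(h, x_j, t_j, l_j, b_j, cond_var=1.0):
    """Equation (6): F[(-t_j + b_j*x_j + l_j*h) * V(y*_j | h)**(-1/2)]."""
    return normal_cdf((-t_j + b_j * x_j + l_j * h) / math.sqrt(cond_var))


t_j, l_j, b_j = 0.3, 0.7, 0.4   # hypothetical item parameters
shift = b_j / l_j               # horizontal displacement of the OTL curve

for h in (-1.0, 0.0, 1.0):
    p_no_otl = icc_otl(h, 0, t_j, l_j, b_j)
    p_otl = icc_otl(h, 1, t_j, l_j, b_j)
    print(f"h={h:+.1f}  No OTL={p_no_otl:.3f}  OTL={p_otl:.3f}")
```

With a positive bj the OTL curve is the No OTL curve moved to the left by bj / lj on the trait scale, which is the sense in which exposure acts like a reduced item difficulty.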

We note that this model allows for differential item functioning in terms of 
difficulty but not in terms of the slope-related parameter lj. This is in line with the data 
analysis findings of Section 4.1 where little difference in slopes of the conditional 
proportion correct curves was found across OTL groups (item 3 was an exception; we 
assume that this item will be reasonably well fitted by a varying difficulty model). More 
general modeling is in principle possible, but the data features do not seem to warrant 
such an extra effort. 

This model disentangles the effects of OTL in an interesting way. Equation (5) 
states that OTL has an effect on the general achievement trait as measured by the g 
coefficients. Here we are interested in finding positive effects of instruction. Through 
the expected increase in h, such effects also have an indirect positive effect on the 
probability of a correct item response. The strength of h's effect on item j is measured by 
the coefficient lj (see equations (4) and (6)). In addition to the indirect effect of OTL for 
item j determined by g and lj, there is also the possibility of a direct OTL effect on item j, 
which is determined by the bj coefficient (see equations (4) and (6)). Any direct effect 
indicates that the specific skill needed to solve item j draws not only on the general 
achievement trait but also on OTL. The size of the g effect indicates the extent to which 
the achievement trait is sensitive to instruction. The size of the bj effect indicates the 
amount of exposure sensitivity or instructional "over-sensitivity" in item j. While 
positive g effects correspond to a positive educational outcome, possible bj effects are of 
less educational interest in that they demonstrate effects of teaching that influence very 
narrow content domains. From a test construction point of view, items that show such 
exposure sensitivity are less suitable for inclusion in standardized tests, since they are 
prone to "item bias" in groups of examinees with varying instructional history. If such 
item bias goes undetected, IRT analysis is distorted. In the modeling presented here, 
however, exposure sensitivity is allowed for and the analysis does not suffer from the 
presence of such effects. 
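
The split into an indirect and a direct effect can be made explicit by substituting equation (5) into equation (4). The following is a sketch of that algebra for a single item j, writing g_k for the k-th element of g; it uses nothing beyond equations (4) and (5) with a diagonal B.

```latex
\begin{aligned}
y_j^* &= l_j h + b_j x_j + e_j \\
      &= l_j \left( g'x + z \right) + b_j x_j + e_j \\
      &= \left( l_j g_j + b_j \right) x_j + l_j \sum_{k \neq j} g_k x_k + l_j z + e_j
\end{aligned}
```

The coefficient of xj thus consists of the indirect effect lj gj, which operates through the trait h, and the direct effect bj, which bypasses the trait.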

Muthen, Kao, and Burstein (1988) present examples of analysis of exposure 
sensitivity using the dichotomous OTL groupings. However, we will first consider an 
analysis in which the three categories No OTL, This Year OTL, and Prior OTL were used. 
Figure 10 shows the estimated item characteristic curves for item 17, having to do with 
acute angles. Since there are three OTL categories, there are three curves corresponding 
to three difficulty values. 

Since the curves for both This Year OTL and Prior OTL are above the No OTL curve, the 
b effects are positive for these two OTL groups. Exposure to the concept of acute angles 
produces a specific skill, which has the same effect as a reduced item difficulty, and this 
skill is not included in the general achievement trait. It is interesting to relate this 
finding to the percentage correct on item 17 broken down by OTL group as given in 
Table 1. Percentage correct increases dramatically from the No OTL category to the OTL 
categories, but the percentage correct is slightly higher for Prior OTL than for This Year 
OTL. For item 17 the Prior OTL students may do better than This Year OTL students, but 
Figure 10 shows that the recency of OTL gives an advantage for students at the same 
achievement trait level. Comparing the estimated item characteristic curves of Figure 
10 with the empirical curves of Figure 5 we find a large degree of similarity but also 
differences. The estimated curves represent more correct and precise estimates of these 
curves. 

Muthen, Kao, and Burstein (1988) found substantial exposure sensitivity in items 
3, 16, 17, 38, and 39, corresponding to solving for x, the product of negative integers, 
acute angles, percentages, and the coordinate system (see appendix). While items 3, 17 
and 39 provided rather poor measurement of the achievement trait as indicated by 
their estimated l values, that was not the case for the other two. The authors 
hypothesized that the exposure sensitivity corresponded to early learning of a 
definitional nature. Further analyses of the rotated form items, carried out by Kao 
(1989), supported this hypothesis. For example, the rotated forms showed exposure 
sensitivity for items covering square root problems. Overall, about 15-30% of the items 
exhibit mild exposure sensitivity, while only about 10-15% exhibit strong exposure 
sensitivity. We may note that these percentages are considerably lower than the Miller 
and Linn (1988) findings using related parts of the SIMS data and standard IRT 
methodology. The effects of OTL on the achievement trait will be discussed in later 
sections. 

6. Multidimensional modeling 

Standard IRT modeling assumes a unidimensional trait, as was also done in the 
previous section. For a carefully selected set of test items, this is often a good 
approximation. However, in many achievement applications, it is reasonable to assume 
that sets of items draw on more than one achievement trait. 

Muthen (1978) presented a method for the factor analysis of dichotomous items, 
where the model is 

(7) $y^* = L h + e$ 

(8) $V(y^*) = L Y L' + Q$ 

where L is a p x m factor loading matrix, Y is a factor covariance matrix, and Q is a 
diagonal matrix of residual variances. In line with item analysis tradition (see Lord & 
Novick, 1968), Muthen fitted the model to a matrix of sample tetrachorics. For an 
overview of factor analysis with dichotomous items, see Mislevy (1986). 

Although of great substantive interest, models with many minor factors are very 
hard to identify by usual means of analysis. For instance, assume as we will for the 
SIMS data, that a general achievement factor is the dominant factor in that it influences 
the responses to all items. Assume that, in addition to this general factor, there are 
several specific factors, orthogonal to the general factor, that influence small sets of 
items of common, narrow content. It is well known that such models with continuous 
data cannot be easily recovered by ordinary exploratory factor analysis techniques 
involving rotations. This problem carries over directly to dimensionality analysis of 
dichotomous items using tetrachoric correlations. 

Consider as an illustration of the problem an artificial model for forty 
dichotomous items. Assume that one general factor influences all items and eight 
specific factors each influence a set of five items. Let the general factor loadings be 0.5 
and 0.6 while the specific factor loadings are 0.3 and 0.4. Let the factors be standardized to 
unit variances and let the factors be uncorrelated. The eigenvalues of the corresponding 
artificial correlation matrix are shown in Figure 11. Such a "scree plot" is used for 
determining the number of factors in an item set. The number of factors is taken to 
correspond to the first break point in the plot where the eigenvalues level off. If the first 
eigenvalue is considerably larger than the others and the others are approximately equal, 
this is usually taken as a strong indication of unidimensionality. Figure 11 clearly 
indicates unidimensionality despite the existence of the eight specific factors. There 
would be no reason to consider solutions of higher dimensionality. 
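
The eigenvalue pattern of such an artificial model can be reproduced approximately with a few lines of Python. This is a sketch only: it assumes NumPy is available, and it assumes that the two loading values alternate from item to item, since the text does not say how they are assigned to individual items.

```python
import numpy as np

p, n_specific, block = 40, 8, 5

# Assumption: the loading values alternate across items.
general = np.tile([0.5, 0.6], p // 2)
specific = np.tile([0.3, 0.4], p // 2)

# Loading matrix: column 0 is the general factor, columns 1..8 the specific factors.
L = np.zeros((p, 1 + n_specific))
L[:, 0] = general
for j in range(p):
    L[j, 1 + j // block] = specific[j]

# Uncorrelated unit-variance factors give the common part L L'.
# Setting the diagonal to one standardizes the item variances.
R = L @ L.T
np.fill_diagonal(R, 1.0)

eigenvalues = np.linalg.eigvalsh(R)[::-1]   # sorted in descending order
print(np.round(eigenvalues[:10], 2))
```

Under these assumptions the first eigenvalue exceeds 12 while all remaining eigenvalues stay below 1.5, so a scree plot of these values shows a single dominant root even though nine factors generated the matrix.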

As a comparison, Figure 12 shows the eigenvalues for the tetrachoric correlation 
matrix for the 39 core items of the SIMS data. The two eigenvalue plots are rather 
similar. Models similar to the artificial one considered above have been studied by 
Schmid and Leiman (1957), where it was pointed out that the above hypothesized 
nine-factor model can also be represented as an eight-factor model with correlated factors. 
FIGURE 11 

FIGURE 12 

Scree Plot of Latent Roots for 39 Items Based on Tetrachorics 
Each of the eight factors may be viewed as a function of both a general, second-order 
factor and the corresponding specific factor of the nine-factor model. The specific factor 
is then viewed as a residual contribution, orthogonal to the second-order factor. Hence, 
Schmid and Leiman used the term hierarchical factor analysis. Using exploratory factor 
analysis on the artificial correlation matrix, an oblique rotation of the eight-factor 
solution did indeed identify the eight correlated factors of such a hierarchical 
reformulation of the model. Schmid and Leiman (1957) gave formulas for transforming 
such a solution back to the original model with a general factor and eight specific factors, 
all factors being uncorrelated. However, without knowing the correct number of factors, 
there would have been no guide to choosing this eight-factor solution. 

The usefulness of hierarchical factor analysis has recently been pointed out by 
Gustafsson (1988a, b). He proposed to circumvent the difficulties of using exploratory 
factor analysis by formulating confirmatory factor analysis models. Hypothesizing a 
certain specific factor structure in addition to a general factor, the confirmatory model 
enables the estimation of factors with very narrow content. Applications of this type of 
modeling to the SIMS data are being considered by the author in collaboration with 
Burstein, Gustafsson, Webb, Kim, Novak, and Short. In line with our previous 
modeling, we may write a simple version of this model as 

(9) $y_j^* = l_{Gj} h_G + l_{Sj} h_{Sk} + e_j$ 

where y*j is the latent response variable for item j (cf. the Section 4 model), hG is the 
general achievement factor, hSk is the specific factor for item j, and ej is a residual. The 
three right-hand-side variables are taken to be uncorrelated. This means that the items 
belonging to a certain specific factor correlate not only due to the general factor but also 
due to this specific factor. 

In this simplified version of the model, it is assumed that each item measures only 
one specific factor. For identification purposes we assume that each specific factor hSk is 
measured by at least two items. Also for identification purposes, our baseline model 
will set lSj = 1 for all j's, although this can be relaxed as the need arises, as will be 
discussed below. In this way, the general factor is assumed to influence each item to a 
different degree, while the specific factor has the same influence on all items in the 
corresponding set. 

The multidimensional confirmatory factor analysis model allows an interesting 
variance component model interpretation. Standardizing the general factor variance to 
unity, while letting the specific factor variances be free parameters, the model implies a 
decomposition of the latent response variable variances into a general factor 
component, a specific factor component, and an error component: 

(10) $V(y_j^*) = l_{Gj}^2 + Y_{Sk} + q_j$ 

where YSk is the variance of the specific factor k. Since the items are dichotomous, the 
variances of the y*'s are standardized to one by restrictions on the qj's. The relative sizes 
of the first two terms on the right-hand side of (10), the general and the specific 
components, are of particular interest. The specific component can also be interpreted as 
the average correlation remaining between items belonging to specific factor k when 
holding the general factor constant. The model can be estimated by confirmatory factor 
analysis techniques for dichotomous items using the LISCOMP computer program, see 
Muthen (1978, 1987). 
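
A small numerical sketch may help in reading the variance decomposition of equation (10). The Python functions below assume unit latent response variances and unit specific-factor loadings, as in the baseline model; the function names and the numerical values are hypothetical and serve only as illustration.

```python
def variance_components(l_g, psi_s):
    """Split the unit variance of y*_j into the three parts of equation (10)."""
    general = l_g ** 2
    residual = 1.0 - general - psi_s      # q_j follows from V(y*_j) = 1
    if residual <= 0.0:
        raise ValueError("general and specific components must sum to less than 1")
    return {"general": general, "specific": psi_s, "residual": residual}


def implied_correlation(l_g_i, l_g_j, psi_s, same_specific_factor):
    """Model-implied latent correlation between two different items."""
    shared_specific = psi_s if same_specific_factor else 0.0
    return l_g_i * l_g_j + shared_specific


parts = variance_components(l_g=0.6, psi_s=0.09)   # hypothetical values
print(parts)
print(implied_correlation(0.6, 0.5, 0.09, same_specific_factor=True))
print(implied_correlation(0.6, 0.5, 0.09, same_specific_factor=False))
```

Two items that share a specific factor correlate more strongly than the general factor alone implies, and under these assumptions the excess equals the specific factor variance.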

The SIMS items of the core and the rotated forms were classified into subsets 
corresponding to specific factors defined both by content and procedure. Examples of the 
narrow item domains that were considered are: arithmetic with signed numbers (core 
items 3, 16, 25), percent calculations (core items 2, 34, 36, 38), estimation skills (size, 
distance; core items 6, 8, 9), and angular measurements (core items 17, 19, 21, 22). 

The analysis steps are as follows. For a given hypothesized set of specific factors, a 
confirmatory factor analysis run can be performed. The initial model may then be 
refined in several steps. An inappropriate combination of items for a specific factor 
gives rise to a low or negative variance component estimate for this specific factor. 
Modifications may be assisted by inspection of model misfit indices. For this model a 
useful index is related to the loadings of the specific factors, lSj, which are fixed to unity 
in the baseline model. The sign and size of the derivatives of these loadings are of 
interest. A positive value for a certain item indicates that if the loading is free to be 
estimated, the estimated value will be smaller than one. In effect, this allows the 
estimate of the variance component for the specific factor at hand to increase. This is 
because the specific variance component is related to the average correlation of the 
specific factor items, conditional on the general factor, where the decrease in the factor 
loading for a certain item means that the contribution from this item is weighted down. 
Thus modifying the initial analysis, items that obtain very low or negative specific factor 
loadings are candidates for exclusion from the set assigned to this specific factor. This 
modification process may be performed in several iterations. In the analyses performed 
for the SIMS data, this procedure appeared to produce substantively meaningful results 
in that the items that were singled out clearly had features that distinguished them from 
the others in the set. 

Table 2 gives the estimated variance components for core items corresponding to 
three of the specific factors. 

Table 2 

Variance Components for Selected Items from the Core* 

                                  Specific Factors 
          General                               Angular 
Item      Factor      Percent     Estimate      Measurement 

AR02      33 (24)     9 (9) 
AR34      39 (32)     9 (9) 
AR36      32 (27)     9 (9) 
AR38      35 (26)     9 (9) 
ME06      20 (14)                 9 (10) 
ME08      38 (27)                 9 (10) 
ME09      38 (29)                 9 (10) 
GE17      28 (17)                               11 (12) 
GE19      17 (12)                               11 (12) 
GE21      24 (17)                               11 (12) 
GE22      43 (30)                               11 (12) 

*Given in parentheses is the estimate when controlling for mean level heterogeneity. (See Section 7.) 



It is seen that the variance contribution from the specific factors can be as large as 50% of 
that of the general factor and is therefore of great practical significance. This is 
particularly so since the sets of items for a specific factor correspond closely to 
instructional units. Analyses of the rotated forms replicated most of the specific factors 
found for the core. 

The confirmatory factor analysis procedure described is a cumbersome one 
involving many iterations and many subjective decisions. An attempt was therefore 
made to find an approach which would involve fewer steps and a more objective 
analysis. It was reasoned that if the influence of the general factor could be removed 
from the item correlations, the remaining correlations would be due to the specific 
factors alone. Such residual correlations could then be factor analyzed by regular 
exploratory techniques, at least if nesting of specific factors within each other was 
ignored. Given a proxy for the general factor, the residual correlations could be obtained 
by bivariate probit regressions of all pairs of items on the proxy, using the LISCOMP 
program. 

An attempt was first made to approximate the general factor for the posttest core 
items with the posttest total score. However, this produced almost zero residual 
correlations. Instead, the pretest total score was used for the posttest items. An 
exploratory factor analysis of these residual correlations, using an orthogonal rotation by 
Varimax, resulted in eleven factors with eigenvalues greater than one. 
Interpretation of these factors showed an extraordinarily high degree of agreement with 
the specific factors previously obtained. The best agreement was obtained for factors that 
had obtained the largest variance component estimates. The exploratory analysis also 
suggested a few items to be added to the specific factors as defined earlier. The 
agreement of these two very different approaches is remarkable and it is interesting that 
the pretest score appears to be a better proxy for the general factor at the posttest occasion 
than the posttest score. This may indicate that the general factor is a relatively stable 
trait related to the achievement level before eighth grade instruction; we note from 
Table 1 that This Year OTL is the most prevalent category. Controlling for posttest score 
may in contrast control for a combination of the general factor and specific factors. 

It is interesting to note that analyses of the core items administered at the pretest 
gave very similar results in terms of specific factors identified by the confirmatory 
approach. This indicates stability of the specific factors over the eighth grade. 
Attempting to compute residual correlations for exploratory factor analysis again gave 
near zero values when controlling for the total score, the pretest score in this case, and 
this approach had to be abandoned. 

7. Modeling with heterogeneity in levels 

The factor analysis of the previous section was performed under the regular 
assumption of identically distributed observations, that is, all students are assumed to be 
sampled from the same population with one set of parameters. However, we have 
already noted that the students have widely varying instructional histories and that the 
homogeneity of student populations is not a realistic assumption. This is a common 
problem in educational data analysis which has been given rather little attention. We 
may ask how this heterogeneity affects our analysis and if it can be taken into account in 
our modeling. 

Muthen (1988a) considers covariance structure modeling in populations with 
heterogeneous mean levels. This research considers the effect of incorrectly 
ignoring the heterogeneity and proposes a method to build the heterogeneity into the 
model. The method is directly applicable to the multidimensional factor analysis model 
considered in the previous section and can also be carried out within the LISCOMP 
framework. Consider the model of equation (7) 

(11) y* = Lh + e 

In the previous section we made the usual standardization of E(hi) = 0 for all 
observations i and assumed V(hi) = Y. However, we know that it is unrealistic to 
assume that, for example, students from different class types have the same factor mean 
levels, and we may instead want to assume that the means vary with class type such that 
for student i in class c we have E(hic) = ac. As pointed out in Muthen (1988a), this may 
be accomplished by considering in addition to (11) the equations 

(12) $h_{ic} = G x_c + z_{ic}$ 

where xc represents a vector of class type dummy variable values for class c, G is a 
parameter matrix, and zic is a residual vector for student i in class c. We assume that 
conditional on class type membership the factor means vary while the factor covariance 
matrix remains constant, 

(13) $E(h_{ic} \mid x_c) = G x_c$ 

(14) $V(h_{ic} \mid x_c) = Y$ 

The modeling also assumes that the matrices L and Q are constant across class types, so 
that 

(15) $E(y^* \mid x_c) = L G x_c$ 

(16) $V(y^* \mid x_c) = L Y L' + Q$ 

It is interesting to note that the assumption of constancy of the conditional covariance 
matrix V(y* | xc) is in line with the homogeneity of correlations found in Section 4.2. 

The structure imposed on the parameter matrices of (15) and (16) may correspond 
to an exploratory or a confirmatory factor analysis model. Muthen (1988a) points out 
that the conditional covariance matrix of (16) is not in general the same as the marginal 
covariance matrix V(y*). In our context this means that even when we have the same 
factor analysis structure in the different class types, this covariance structure does not 
hold in the total group of students. The approach outlined here essentially provides a 
mean-adjusted analysis of pooled covariance matrices assumed to be equal in the 
population. In our situation the analysis effectively is carried out on pooled tetrachoric 
correlation matrices. This modeling has two important outcomes. The dimensionality 
analysis can be carried out without distortion due to the differences in factor mean 
levels across class types, and the factor mean levels can be estimated. 
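
The difference between the conditional and the marginal covariance matrix follows from the law of total covariance. As a sketch, let Y denote the factor covariance matrix and write V(x) for the covariance matrix of the class type dummy variables across classes; the symbol V(x) is introduced here for illustration only and does not appear in the original. Combining equations (15) and (16) gives

```latex
\begin{aligned}
V(y^*) &= E\left[ V(y^* \mid x_c) \right] + V\left[ E(y^* \mid x_c) \right] \\
       &= L Y L' + Q + L G \, V(x) \, G' L' \\
       &= L \left[ Y + G \, V(x) \, G' \right] L' + Q
\end{aligned}
```

The marginal matrix therefore coincides with the conditional one only when G V(x) G' = 0, that is, when the factor means do not vary across class types.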

The above mean-adjusted analysis was carried out on the SIMS core items using 
the multidimensional factor model from Table 2 of the previous section. Factor mean 
differences were allowed for class type, using three dummy variables, and also for gender. 
We will concentrate our discussion of the results on the factor structure. Despite large 
mean differences across class type for the general achievement factor, a factor structure 
very similar to the previous one emerged. The same specific factors showed large and 
small variances, respectively. Hence, the potential for a distorted structure is not 
realized in these data. The results are presented in parentheses in Table 2. It is seen that 
the variance contributions from the general factor are considerably reduced as compared to 
the first approach. 

The reduction in variance contribution from the general factor is natural since 
holding class type constant reduces the individual differences in the general 
achievement trait due to selection of students. If the inference is to the mix of students 
encountered in the SIMS data, the unreduced variation in the trait is the correct one, but 
this variation is not representative for a student from any given class type. It is also 
interesting to note that the specific factor variances are not similarly reduced by holding 
class type constant, presumably indicating that these specific skills are largely unrelated 
to the student differences represented by class type. 

8. Estimation of trait scores 

Sections 5, 6, and 7 have considered various factor analysis models for the 
achievement responses. Assuming known or well-estimated parameter values for these 
models it is of interest to estimate each student's score on factors of these models. For 
the standard, unidimensional IRT model, estimation of the trait values is a standard task 
which may be carried out by maximum likelihood, Bayes' modal (maximum a 
posteriori), or expected a posteriori estimators (see for example Bock & Mislevy, 1986). 
The instructionally sensitive models we have considered for the SIMS data have, 
however, brought us outside this standard situation in the following three respects: 
(i) In line with Section 5 we want to consider factor score estimation that takes 
into account that certain items have different difficulty levels depending on the students' 
OTL level. 

(ii) In line with Section 6 we want to consider factor scores for both the general 
achievement factor and the specific factors in the multidimensional model. 

(iii) In line with Section 7 we want to consider factor score estimation that takes 
into account differences in student achievement level. 

We note that (i) and (iii) are quite controversial since these points raise the issue 
of estimating achievement scores based not only on the student's test responses but also 
on his/her instructional background. For example, Bock (1972) has argued that prior 
information on groups should not be used in comparisons of individuals across groups. 
Nevertheless, it would seem that students who have had very limited OTL on a set of 
test items will be unfairly disadvantaged in comparison with students with different 
instructional exposure. The aim may instead be to obtain achievement scores for given 
instructional experiences. 

Point (ii) is of considerable interest. While a rough proxy for the general 
achievement score is easily obtainable as the total test score, the adding of items 
corresponding to specific factors would involve only a few items, resulting in a very 
unreliable score. As a contrast, estimating the specific factor scores draws on the 
correlated responses from all other items. 

The following estimation procedure was discussed in Muthen and Short (1988) 
and handles all three cases above. For various density and probability functions, g, 
consider the posterior distribution of the factors h, 

(17) $g(h \mid y, x) = f(h \mid x)\, g(y \mid h, x) / g(y \mid x)$ 

Here, the first term on the right-hand side represents a normal prior distribution 
for h conditional on x, where as before x represents instructional background variables 
such as OTL and class type. In line with Section 7 the factor covariance matrix may be 
taken as constant given x, while the factor means may vary with x. The second term on 
the right-hand side represents the product of the item characteristic curves, which may 
vary in difficulty across OTL levels as discussed in Section 5. 
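
For a single trait, the posterior distribution above can be evaluated on a grid, and its mean gives an expected a posteriori score. The Python sketch below is an illustration under stated assumptions: residual variances are fixed at one, the grid and the item parameters are hypothetical, and the same prior mean is used for both students (in the model of the text the prior mean could itself depend on x).

```python
import math


def normal_cdf(z):
    """Standard normal distribution function F."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def eap_score(y, x, t, l, b, prior_mean=0.0, prior_var=1.0, n_points=161):
    """Expected a posteriori estimate of h on an evenly spaced grid.

    y, x: 0/1 item responses and 0/1 OTL indicators.
    t, l, b: thresholds, loadings, and direct OTL effects per item.
    """
    sd = math.sqrt(prior_var)
    lo = prior_mean - 5.0 * sd
    step = 10.0 * sd / (n_points - 1)
    num = den = 0.0
    for k in range(n_points):
        h = lo + k * step
        # Normal prior f(h | x), up to a constant factor.
        w = math.exp(-0.5 * (h - prior_mean) ** 2 / prior_var)
        # Likelihood g(y | h, x): product of item characteristic curves.
        for yj, xj, tj, lj, bj in zip(y, x, t, l, b):
            p = normal_cdf(-tj + bj * xj + lj * h)
            w *= p if yj == 1 else 1.0 - p
        num += h * w
        den += w
    return num / den


# Hypothetical five-item test; the first two items are exposure sensitive.
t = [-0.5, 0.0, 0.2, 0.5, 1.0]
l = [0.8, 0.7, 0.9, 0.6, 0.8]
b = [0.5, 0.5, 0.0, 0.0, 0.0]
y = [1, 1, 0, 1, 0]

no_otl = eap_score(y, [0, 0, 0, 0, 0], t, l, b)
with_otl = eap_score(y, [1, 1, 1, 1, 1], t, l, b)
print(round(no_otl, 3), round(with_otl, 3))
```

With identical responses, the student without OTL receives the higher estimate in this sketch, because correct answers on exposure-sensitive items are stronger evidence of the trait when the items were not taught.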

Muthen and Short (1988) considered an example of the situation of (i) and (iii). 
They generated a random sample of 1,000 observations from a model with forty items 
measuring a unidimensional trait. Observations were also generated from forty OTL 
variables and five other background variables. All background variables were assumed 
to influence the trait while the first twenty OTL variables had direct effects on their 
corresponding items, giving rise to exposure sensitivity in these items. Among other 
results, Muthen and Short considered differences in factor score estimates using the 
above method and the traditional IRT method. In Table 3 comparisons of the two 
corresponding score distributions are presented by quartiles, broken down in two parts: 
students with a high total sum of OTL and students with a low sum. The table 
demonstrates that for students of the low OTL group, estimated scores are on the whole 
higher with the new method, corresponding to an adjustment for having had less 
exposure, while for the high OTL group the estimated scores are on the whole lower for 
the new method. 

Ongoing work by Muthen and Short investigates situation (ii) and the precision 
with which scores for specific factors can be estimated. Once the estimated factor scores 
have been calculated they may conveniently be related to various instructional variables 
and may also be studied for change from pretest to posttest. 

Table 3 

Trait Estimates by Traditional and New Approaches* 

LOW OTL GROUP 

                            TRADITIONAL 
NEW          25%       50%       75%      100%     TOTAL 

25%          136         6         0         0       142 
          -1.323    -0.610                        -1.293 
          -1.255    -0.724                        -1.233 

50%           10       125         5         0       140 
          -0.783    -0.361     0.037              -0.375 
          -0.624    -0.338    -0.119              -0.351 

75%            0        13       111         7       131 
                    -0.094     0.309     0.827     0.297 
                     0.058     0.316     0.691     0.311 

100%           0         0         6       124       130 
                               0.691     1.282     1.255 
                               0.834     1.308     1.286 

TOTAL        146       144       122       131       543 
          -1.286    -0.347     0.317     1.257 
          -1.212    -0.318     0.324     1.275 
Table 3 (Continued) 

Trait Estimates by Traditional and New Approaches* 

HIGH OTL GROUP 

                            TRADITIONAL 
NEW          25%       50%       75%      100%     TOTAL 

25%          [?]       [?]       [?]       [?]       [?] 
             [?]       [?]                        -1.245 
          -1.349    -0.743                        -1.298 

50%            5        94        12         0       111 
          -0.720       [?]     0.049              -0.315 
        -0.[?]66       [?]    -0.119              -0.349 

75%            0         3       110         5       118 
                    -0.167  0.3[?]3      0.870     0.355 
                    -0.222     0.322     0.640     0.327 

100%           0         0         6       114       120 
                               0.653     1.386     1.349 
                               0.782     1.334       [?] 

TOTAL        104       106       128       119       457 
          -1.278    -0.355     0.332     1.364 
          -1.312    -0.389     0.302     1.305 

*Entries are frequency, mean value by the traditional approach, and mean value by the 
new approach. 



9. Predicting achievement 

Given the explorations of the previous sections, we may attempt to formulate a 
more comprehensive model for the data. Muthen (1988b) proposed the use of structural 
equation modeling for this task. He discussed a model which extends ordinary 
structural modeling to dichotomous response variables while at the same time 
extending ordinary IRT to include predictors of the trait. He studied part of the SIMS 
data using a model which attempted to predict a unidimensional algebra trait at the 
posttest occasion using a set of instructional and student background variables from the 
pretest. The set of predictors used and their standardized effects are given in Table 4. 
While pretest scores have strong effects, as expected, class type, being female, father being 
in the high occupational category, and finding mathematics useful to future needs also had 
strong effects. The OTL variables had very small effects overall, perhaps due to the fact 
that each item's OTL variable has rather little power in predicting this general trait. 



Table 4

Structural Parameters with the Latent Construct as Dependent Variable

Regressor          Estimate    Estimate/S.E.

PREALG               0.68          7
PREMEAS              0.45          [illegible]
PREGEOM              0.33          5
PREARITH             2.09          [illegible]
FAED                 0.07          [illegible]
MOED                 0.02          [illegible]
MORED                0.18          [illegible]
USEFUL               0.45          7
ATTRACT              0.04          1
NONWHITE            -0.02          [illegible]
REMEDIAL             0.07          1
ENRICHED             0.22          3
ALGEBRA              0.56          4
FEMALE               0.14          [illegible]
LOWOCC               0.02          1
HIGHOCC              0.12          3
MISSOCC              0.05          2
NONWXREM             0.10          1
NONWXENR             0.19          [illegible]
NONWXALG            -0.18         -1
PREARITHXREM        -1.45         -3
PREARITHXENR        -0.10         -1
PREARITHXALG        -0.54         -2
NONWXPREARITH       -0.19         -1


Given the analysis results of the previous sections, this modeling approach can
be extended to include a multidimensional model for both the set of pretest and posttest
items, predicting posttest factors from pretest factors, using instructional and student
background variables as covariates, and allowing for differential item functioning in
terms of exposure sensitivity. This work is in progress.

10. Analyzing change

The structural modeling discussed in the previous section is also suitable for
modeling of change from pretest to posttest. In Section 4.3 we pointed out that in terms
of change the SIMS data again exemplified complex population heterogeneity. For each
item a student may belong to one of three OTL groups, corresponding to two types of
no new learning and learning during the year. To again reach the goal of instructionally
sensitive psychometrics as stated in Section 3 for this new situation, we should explicitly
model this heterogeneity. However, to properly model such complex heterogeneity is a
very challenging task and this work has merely begun.

A basic assumption is that change is different for groups of students of different
class types and OTL patterns. In a structural model where posttest factors are regressed
on pretest factors, the slopes may be viewed as varying across such student groups:
student groups for which a large degree of learning during the year has taken place, as
measured by the set of OTL variables, are assumed to have steeper slopes than the other
students. Psychometric work in this methods area is very scarce.
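
As a minimal illustration of slopes that vary across student groups, the sketch below regresses a posttest score on a pretest score separately within each group. It is not the latent-variable structural model discussed above: the scores are treated as observed rather than as factors, and the data, the group labels (`high_otl`, `low_otl`), and the function name `slopes_by_group` are hypothetical.

```python
import numpy as np

def slopes_by_group(pre, post, group):
    """Fit post = intercept + slope * pre by least squares within each group."""
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    group = np.asarray(group)
    fits = {}
    for g in np.unique(group):
        mask = group == g
        # np.polyfit returns coefficients from highest degree down: [slope, intercept]
        slope, intercept = np.polyfit(pre[mask], post[mask], deg=1)
        fits[str(g)] = (float(intercept), float(slope))
    return fits

# Hypothetical, noise-free data: the high-OTL group is given the steeper slope.
pre = np.array([-1.0, 0.0, 1.0, -1.0, 0.0, 1.0])
group = np.array(["high_otl"] * 3 + ["low_otl"] * 3)
post = np.where(group == "high_otl", 0.5 + 1.2 * pre, 0.1 + 0.8 * pre)

fits = slopes_by_group(pre, post, group)
for name, (intercept, slope) in fits.items():
    print(f"{name}: intercept={intercept:.2f}, slope={slope:.2f}")
```

Comparing the fitted slopes across groups is the observed-score analogue of the assumption that groups with more learning during the year have steeper posttest-on-pretest slopes.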
NON-COGNITIVE DATA
A Cross-national Perspective



W. Todd Rogers
University of British Columbia

I welcome this opportunity to share with you my experience when analyzing the
non-cognitive attitudinal data collected as part of SIMS. The results of that analysis may
be found in the attached paper, from which I will speak:

Rogers, W. T., & O'Shea, T. (1985). A comparative analysis of attitudes toward
mathematics of senior high school students in British Columbia, Ontario, and
the United States. Canadian and International Education/Education
Canadienne et Internationale, 14, 39-58.

After summarizing this study and its results, I would like to make some
additional remarks concerning my view of secondary analysis and what can be done to
encourage such analyses.

Secondary Analyses of Previously Collected Data: Some Comments

Studies such as SIMS, The National Assessment of Educational Progress, and 
state and provincial (in the case of Canada) assessments offer a rich source of data for 
primary and secondary data analyses. This is particularly so in light of the periodic 
replication or repetition of these studies. 

Invariably, the first focus of these studies is to answer the questions initially
posed when seeking funding support. But, in recognition of the massive data sets
required to answer these primary questions, and the relatively limited amount of
available funds, the principal investigators often include in their rationale for seeking
support for the study that the data will be made available to other researchers for
secondary analyses. The supposition here is that the data collected are amenable to
addressing questions other than those included in the primary set and, therefore, costs
can be amortized across a wider base or set of studies. In my opinion, this is appropriate.
Indeed, I advocated such an approach when assisting the Ministry of Education in
British Columbia to establish its provincial assessment program.

The paper I am discussing can be classified as a secondary study of the data from
the SIMS. Neither Tom O'Shea nor I was involved in the creation of the data collection
instruments. I did have some early involvement in the sample design for the province.
Therefore, I think we can be classified as people who accepted the invitation to look at
the IEA data from the Second Study. Our research was supported by the Social Sciences
and Humanities Research Council of Canada as part of a larger research project (under
the direction of Dr. David Robitaille) designed to complete further analyses of the IEA
data beyond the analyses included in the initial IEA proposal.

What was our experience? First, I would like to thank the SIMS project directors
for making available the data we used. Further, as questions arose concerning the
development and validation of the attitude scales or the nature of the data set,
particularly with respect to sampling weights, they were graciously and quickly
answered. Such initial and continued cooperation is crucial to the success of a secondary
study. I therefore recommend that:

(i) support for such cooperation be included in the initial funds provided for the
initial or primary study.

One problem which we had, and which we felt we could do something about,
was how to treat missing data. To our surprise, the file we received had not yet been
edited for missing data. Our concern was how to treat such data so that our treatment
was consistent with that employed by the IEA in its own analyses, and in other analyses
of the IEA data. In our study, students who omitted more than three quarters of the
items of a scale were removed from the file; missing data on individual items for an
individual student were assigned the mid-point value of three. But is this what others
would have done? A third and more difficult problem to solve centered on matching
student class data with teacher data in schools from which more than one class was
selected. This problem likely arose during data collection, when the data were initially
collected, or in data entry, when the data were transferred to the computer data file.
Whatever the source, it was a problem which we were not able to solve. Consequently,
we removed unmatched classes and teachers. To ensure comparability of data sets across
different analyses of the same data set, I recommend that

(ii) data files be edited by the primary investigators prior to their release to
those wishing to do additional analyses of the data, and the editing procedure used be
clearly described in supporting documentation.

A related recommendation is that

(iii) an intermediate data file containing descriptive statistics be provided by the
primary investigators to act as a check against which the secondary researchers may
verify that they have correctly accessed the data file.
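
The missing-data rule described above (remove a student who omitted more than three quarters of the items of a scale; otherwise assign the mid-point value of three to each omitted item) can be sketched as follows. This is only an illustration under stated assumptions: omitted responses are coded as `np.nan`, and the function name `edit_scale`, its parameters, and the small data set are hypothetical rather than taken from the study's files.

```python
import numpy as np

MIDPOINT = 3.0  # "undecided" on the five-point scale

def edit_scale(responses, max_omit_fraction=0.75, fill=MIDPOINT):
    """Apply the editing rule to one scale.

    responses: students x items array, with np.nan marking an omitted item.
    Returns (edited responses for retained students, boolean mask of retained students).
    """
    responses = np.asarray(responses, dtype=float)
    omitted = np.isnan(responses)
    # Retain students who omitted no more than the allowed fraction of items.
    keep = omitted.mean(axis=1) <= max_omit_fraction
    # For retained students, replace each omitted item with the mid-point value.
    edited = np.where(omitted[keep], fill, responses[keep])
    return edited, keep

nan = np.nan
data = [
    [1, 2, 4, 5],          # complete
    [2, nan, 4, 4],        # one omission: filled with 3
    [nan, nan, nan, nan],  # all items omitted: removed
    [nan, nan, nan, 5],    # exactly three quarters omitted: retained
]
edited, keep = edit_scale(data)
print(keep)
print(edited)
```

Publishing a rule of this kind alongside the data file, together with the descriptive statistics it produces, is one way a secondary analyst could check that an edited file matches the primary investigators' version.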
The last problem with which we tussled revolved around the measurement
scales. Clearly, the scales employed by SIMS were developed prior to our involvement.
Thus, we had to use what was available. As might be expected, we would have liked to
have made some changes. However, unlike the concerns above, we were not able to.
Indeed, fundamental to answering a research question is the validity of the data
collection instruments (tests, survey forms, interview schedules) in terms of that
question. If a researcher perceives that the instruments used and, hence, the data
collected are not appropriate for the research questions posed, then it is unlikely that
he/she will request a copy of the data set. I deliberately used the word perceived, for
very often the development and validation of the instruments used in the primary
study are not adequately described for others to gain a full understanding of these
instruments and their use. We found it necessary, for example, to contact the IEA
officials on more than one occasion for information beyond that contained in the
documentation provided. It is therefore recommended that

(iv) complete documentation in a form similar to that called for in the Standards
for Educational and Psychological Tests be provided by the primary investigators.

Consideration of the above issues together reveals that many of the concerns are
traceable to documentation. Recognition must be given to the needs of a secondary
researcher so that he/she comes to feel ownership of the data provided in much the
same way as ownership is felt when a researcher collects his/her own data. No small
feat, provision should be made when seeking support for the initial study to include
proper and full documentation of all elements of the primary research for use in a
secondary analysis of that data.
A Comparative Analysis of Attitudes Toward
Mathematics of Senior High School Students in 
British Columbia, Ontario, and the United States 



Todd W. Rogers and Thomas O'Shea 



The International Association for the Evaluation of Educational Achievement
(IEA) recently conducted its second study of mathematics. The portion of the
study reported here deals with the attitudes toward mathematics of Grade 12
students in British Columbia, Grade 12 and 13 students in Ontario, and Grade 12
students in the U.S.A., some of whom were enrolled in Calculus courses.

Students responded to Likert-type items making up the following scales:
Mathematics in School, Calculators and Computers, Home Support for
Mathematics, Mathematics and Me, Mathematics and Utility, Mathematics and
Gender, and Mathematics as a Process. For each scale a 5 x k ("country"-by-item, k
the number of items) median polish was used to analyze the unique and joint
effects of country and item. Results are reported in three areas: students' opinions
of the curriculum, personal perceptions of mathematics, and views
on the discipline of mathematics.

In general, students reported remarkably similar views on all scales. We attribute
this to the pervasive influence of American educational theory and practice and
to the structured nature of the mathematics curriculum. Students also appeared
to value mathematics for its practicality rather than for its intrinsic worth.

Introduction 

In keeping with the basic design of the comparative assessments conducted by the
International Association for the Evaluation of Educational Achievement (IEA), the
Second IEA Study of Mathematics included an assessment of student opinions,
preferences, and attitudes toward a number of aspects of mathematics and mathematics
education. IEA is an association of educational research organizations and ministries of
education whose primary goals are to conduct educational research on an international
level and to assist member-states in undertaking cooperative research projects. IEA has
conducted international surveys in the past, including the First Mathematics Study
(Husen, 1967) and the Six Subject Survey (Peaker, 1975; Walker, 1976). In the Second
Study of Mathematics the attitudinal topics ranged from the nature of mathematics and
its role in society to specifics of the mathematics curriculum. As well, attitudes of
teachers toward mathematics as a process were assessed.

Justification for assessing attitudes comes not only from the tradition of IEA
studies, but also from the importance attached to affective variables in research and in
the assessment of outcomes and processes of schooling. The search continues for
predictors from the affective domain, as well as from among personological and
process variables, to increase the accuracy of statistical models to explain and predict
variation in mathematics achievement. Alternatively, affective measures are thought to
reflect outcomes of schooling. Affective variables, then, become outcomes to be
explained or predicted rather than variables to be used to explain or predict.

More relevant to the present study, however, is the use of affective measures to
assess how students perceive and respond to what is actually happening in schools.
Student responses to the affective items included in this survey reflect, from the point of
view of the learner, what is occurring in the mathematics classroom. Results on the
survey items not only are indicative of what happens in classrooms, but also reflect
prevailing opinions about mathematics in broader social contexts. Except for the
influence of school, there would be little reason for a student to have formed an opinion
about the importance of a particular mathematics concept or about the usefulness of
learning that concept.

Of particular interest in this paper are the differences and similarities between the
opinions, preferences, and attitudes of students from five "countries": Grade 12 in
British Columbia (B.C.), Canada; Grade 12 and Grade 13 in Ontario (Ont.), Canada; and
Pre-Calculus and Calculus in the United States (U.S.). Factors which suggest that the
responses would be more similar than different include the close proximity of Canada
and the United States, the pervasive influence of American educational theory and
practice (Andrews & Rogers, 1982), and the structured nature of the mathematics
curriculum. Similarities should be particularly evident for students at comparable levels
of education: Grade 12 in B.C., Grade 12 in Ont., and Pre-Calculus in the U.S.; and Grade
13 in Ont. and Calculus in the U.S. On the other hand, Canada's commitment to
maintaining and encouraging its own identity, particularly in Ontario where all
textbooks must be authored by Canadians, may result in differences in opinions and
attitudes.

In addition to the student responses, the views of teachers concerning the nature
of mathematics as a discipline were identified. Teachers also responded to four items
related to the secondary mathematics curriculum. Three of these were involved in the set
of 15 presented to the students. The absence of a rationale supporting the selection of
these three items and the use of only four raises serious questions about the
comprehensiveness of coverage. For this reason, the teacher responses to these items
were not included.

Structure of the Items and Scales

Because there existed no apparent consensus in mathematics education of what
should be measured in the affective domain, the International Mathematics Committee
(IMC) proposed the following four guidelines for constructing attitude items:

• Items should address issues of importance to mathematics educators,

• Responses to items should provide useful descriptive information,

• Items should permit the formation of scales, and

• There should be items from the first IEA study of mathematics (Kifer, 1979).

Based on these guidelines, and following general discussions involving IMC
members and representatives of the IEA General Assembly, seven general domains
were identified. Table 1 contains a short description of each domain and the final
number of items in each. A copy of the final form of each scale is provided in the
Appendix of this paper.

Briefly, initial items were selected from the first IEA mathematics survey, the
National Assessment of Educational Progress in the U.S., and from other existing
mathematics attitude scales. New items were written to provide adequate size pools for
each domain. Although responses to the items were structured differently depending on
the scale, a common five-point Likert format was adopted. An example of an item taken
from Mathematics in School and illustrative of the differences in structure is as follows:

Solving Equations

a) Important - Not important

   Very                                   Not          Not at all
   Important    Important    Undecided    Important    Important

b) Easy - Hard

   Very                                                Very
   Easy         Easy         Undecided    Hard         Hard

c) Like - Dislike

   Like                                                Dislike
   a Lot        Like         Undecided    Dislike      a Lot

These item pools were pilot tested in the U.S., and the results were used to select
appropriate items and to form scales. The scales were then field tested in international
trials to evaluate the extent to which items translated well, were acceptable to the
participating countries, and possessed desirable psychometric properties. The content
validity of the scales was reported to be satisfactory by mathematics educators in the
participating countries. Estimates of internal consistency derived from the field testing,
however, revealed that analyses at the item level, and not at the scale level, would be
appropriate for Calculators and Computers and Mathematics as a Process (Kifer, 1979,
personal communication).



Samples

Since the interest in the Second IEA Study of Mathematics was focused on
teaching and learning at the class level, probability samples of classes were selected from
each population of interest. A basic sample design was recommended, but countries
were permitted to make approved modifications. Table 2 summarizes the stratification
and selection procedures employed for each population. As shown, each sample may be
described as a deeply stratified, multi-stage probability sample.

The overall response rates at the class levels were 90% for B.C. and 86% for Ont.
For the U.S., school districts, schools, and classes were oversampled to allow for refusals.
The cooperation rates at each stage were approximately 50%, 75%, and 90%. Despite these
lower values, the desired sample sizes were achieved (Garden, 1985). Furthermore, the
obtained samples were comparable to other U.S. samples (Garden, 1985, Appendix 3).

Data Analysis

To facilitate examination of the relationship between student and teacher
responses on the Mathematics as a Process items, the item data files were first edited to
remove respondents for whom data were missing on entire scales, and then to remove
unmatched classes and teachers. Individual items containing missing data were assigned
the mid-point value three (undecided). At the same time, the polarity of negatively
Table 1

Attitude Items and Scales

Domain                          Description of Items                            No. of Items

1. Mathematics in School (a)    Attitudes toward mathematical topics and             15
                                activities believed to be universally
                                part of mathematics curricula. Three
                                dimensions were considered: importance,
                                difficulty, liking.

2. Calculators and Computers    View of the nature and usefulness of hand             8
                                calculators and computers.

3. Home Support for             Parental ability and support for the                  9
   Mathematics                  study of mathematics.

4. Mathematics and Me           Personal reaction to the study of                    15
                                mathematics in terms of feelings,
                                enjoyment, competence, anxiety, and
                                willingness to study more mathematics.

5. Mathematics and Utility      View of the practical value of                        8
                                mathematics in preparing for an
                                occupation and in everyday life.

6. Mathematics and Gender       Views toward sex differences in                       4
                                mathematical ability and the need to
                                know mathematics for career purposes.

7. Mathematics as a             View of the nature of mathematics as a               15
   Process (b)                  discipline: as a set of rules or a
                                field where creativity, speculation,
                                conjecture, and heuristics are
                                important; as a field with fixed or
                                changing content.

(a) The first three items of this scale, and an additional item, were presented to
teachers. These items were not included in the present study due to the
questionable selection and comprehensiveness of these four items.

(b) Presented to both students and teachers.
Table 2

Stratification and Selection Procedures
for the Sample Classes

Population Definition

  Grade 12, B.C.:
    Grade 12 students in public schools enrolled in Algebra 12. Students in
    private schools were excluded (approx. 3% of Grade 12 population).

  Grades 12 & 13, Ont.:
    1. Grade 12 students in public and private schools enrolled in Grade 12
       Mathematics.
    2. Grade 13 students in public and private schools enrolled in at least
       two of Relations, Calculus, and Algebra.
    Students in special schools for foreigners and schools with no fixed
    timetable were excluded.

  Pre-calc. & Calc., U.S.:
    Students in public and private schools enrolled in 4th year mathematics
    courses with prerequisites of three years of secondary level mathematics
    (Algebra and Geometry).

Stratification

  Grade 12, B.C.:
    Geographic regions regularly used by Ministry of Education, and school size.

  Grades 12 & 13, Ont.:
    Geographic region, size of community, public/private, English/French,
    ratio of Grade 13 and Grade 12.

  Pre-calc. & Calc., U.S.:
    Public/private, regional standard metropolitan statistical area, status code.

Selection of Classes

  Grade 12, B.C.:
    a) Proportional allocation of classes to strata.
    b) Allocation of sample to schools categorized by size.
    c) Allocated number of schools selected proportional to size.
    d) One class [partly illegible] randomly selected from sampled schools.
    e) All students in selected classes.

  Grades 12 & 13, Ont.:
    Five schools selected proportional to number of Grade 13s.
    One Grade 12 class plus one class from each of Relations, Algebra,
    [illegible] randomly selected from sampled schools.
    All students in selected classes.

  Pre-calc. & Calc., U.S.:
    Proportional allocation of classes to strata.
    Allocation of sample to school districts categorized by size.
    Two schools proportional to size.
    Two classes randomly selected from sampled schools. (Private class
    sample [illegible] in two stages: schools selected proportional to size and
    two classes randomly selected from sampled schools.)
    All students in selected classes.

Source: Garden, R. A. (1985).



worded items was reversed. The result was a consistent matched student-within-class
and teacher file. The final sample sizes for each country are summarized in Table 3.
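
Reversing the polarity of a negatively worded item on a five-point scale amounts to replacing a response x by 6 - x, so that 1 and 5 exchange places and 3 (undecided) is unchanged. The sketch below illustrates this recoding; the function name `reverse_score` and the choice of which column is negatively worded are hypothetical.

```python
import numpy as np

def reverse_score(responses, negative_items, low=1, high=5):
    """Reverse the polarity of the listed item columns on a low..high scale."""
    out = np.array(responses, dtype=float)
    # On a 1..5 scale this maps x to 6 - x for the selected columns only.
    out[:, negative_items] = (low + high) - out[:, negative_items]
    return out

# Two students by three items; assume (hypothetically) that item 1 is negatively worded.
responses = np.array([[1, 5, 3],
                      [4, 2, 3]])
recoded = reverse_score(responses, negative_items=[1])
print(recoded)
```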

Internal Consistency. Before proceeding to the statistical analyses conducted to
investigate the research hypotheses, item analyses (Nelson, 1974) were performed for
each scale and sample. The results are reported in Table 4.

Examination of these data reveals that the properties of the items and scales are
quite comparable among the five student samples and for the teacher sample on the one
scale. Furthermore, these results are similar to the results from the international trials
(see Kifer, 1979), and analysis only at the item level is warranted for Calculators and
Computers and Mathematics as a Process. Consequently, given the desirability of a
uniform approach to analyses across scales, "item" was included as a factor along with
country.
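
For readers who wish to compute comparable item statistics, the sketch below calculates an internal consistency coefficient and item-scale correlations for a students-by-items score matrix. Two assumptions are made: internal consistency is computed as coefficient alpha, which is algebraically equivalent to Hoyt's (1941) analysis-of-variance estimate, and each item is correlated with the sum of the remaining items (a "corrected" item-scale correlation), which may or may not be the variant reported in Table 4. The function names and data are hypothetical.

```python
import numpy as np

def coefficient_alpha(scores):
    """Coefficient alpha for a students x items matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    sum_item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - sum_item_var / total_var)

def item_scale_correlations(scores):
    """Correlation of each item with the sum of the remaining items."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    return np.array([
        np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
        for j in range(scores.shape[1])
    ])

# Hypothetical responses of three students to a two-item scale.
scores = np.array([[1, 2],
                   [2, 1],
                   [3, 3]])
alpha = coefficient_alpha(scores)
item_r = item_scale_correlations(scores)
print(round(alpha, 3), np.round(item_r, 3))
```

A wide range of item-scale correlations, with some near zero, is the kind of pattern that argues for analysis at the item level rather than the scale level.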

Unit of Analysis. Related to the previous decision was the question of which unit
of analysis, student or class, should be used to examine the country and item factors.
Much controversy surrounds this issue (see, for example, Hopkins, 1982).

In line with the approach taken by Kifer (Travers, to appear), item scores were
aggregated to the class level. It was felt that, because testing took place at the end of the
school year, the assumption of independence among students within class was difficult
to justify. On the other hand, the assumption of independence between classes and,
particularly, between teachers within schools was held to be tenable. Initial examination
of between-class and between-teacher differences in schools where more than one
class-teacher was assessed revealed distinct differences. As a consequence,
Table 3

Final Student, Class, and Teacher Sample Sizes

                       Number of    Number of                  Class Size
Country                Students     Classes/Teachers    Mean     S.D.     Range

Grade 12 B.C.            1943            95             20.5     6.8      3-35
Grade 12 Ont.            1236            55             22.5     5.6      9-37
Pre-Calculus U.S.        3891           207             18.8     7.4      3-45
Grade 13 Ont.            3143           175             18.0     6.9      1-32
Calculus U.S.             731            43             17.0     6.3      3-30




Table 4

Means, Standard Deviations, Internal Consistencies, and
Range of Item-Scale Correlations

                   No. of                   Item    Item    Internal           Range of Item-
Scale              Items    Sample          Mean    S.D.    Consistency (a)    Scale Correlations

Mathematics in School

  Importance        15      12 B.C.         3.65    0.45        .79            .03-.58
                            12 Ont.         3.59    0.52        .83            .02-.61
                            Pre-C. U.S.     3.75    0.49        .82            .14-.39
                            13 Ont.         3.77    0.51        .81            .10-.60
                            Calc. U.S.      3.87    0.45        .80            .13-.60

  Difficulty        15      12 B.C.         3.35    0.45        .80            .24-.34
                            12 Ont.         3.33    0.48        .81            .32-.54
                            Pre-C. U.S.     3.41    0.45        .79            .29-.31
                            13 Ont.         3.44    0.50        .82            .28-.55
                            Calc. U.S.      3.35    0.49        .81            .29-.54

  Liking            15      12 B.C.         3.09    0.47        .78            .03-.50
                            12 Ont.         3.02    0.51        .51            .12-.31
                            Pre-C. U.S.     3.09    0.49        .78            .14-.47
                            13 Ont.         3.14    0.51        .78            .08-.32
                            Calc. U.S.      3.13    0.49        .77            .12-.51
                   No. of                   Item    Item    Internal           Range of Item-
Scale              Items    Sample          Mean    S.D.    Consistency (a)    Scale Correlations

2. Calculators       8      12 B.C.         3.53    0.50        .62            .00-.47
   and Computers            12 Ont.         [ill.]  0.49        .48            -.11-.36
                            Pre-C. U.S.     3.64    0.46        .53            -.02-.39
                            13 Ont.         3.35    0.48        .54            .00-.42
                            Calc. U.S.      3.65    0.44        .31            .04-.40

3. Home Support      9      12 B.C.         3.34    0.64        .76            .31-.53
   for Mathematics          12 Ont.         [ill.]  0.63        .73            .31-.35
                            Pre-C. U.S.     3.54    0.60        .73            .34-.56
                            13 Ont.         3.33    0.62        .74            .26-.55
                            Calc. U.S.      3.32    0.58        .71            .26-.58

4. Mathematics              12 B.C.         3.57    0.87        .87            .01-.72
   and Me                   12 Ont.         [ill.]  [ill.]      .91            [ill.]
                            Pre-C. U.S.     3.69    0.56        .89            .21-.72
                            13 Ont.         3.62    0.59        .90            .27-.71
                            Calc. U.S.      3.79    0.53        .88            .24-.74

5. Mathematics              12 B.C.         3.46    0.59        .78            .40-.56
   and Utility              12 Ont.         [ill.]  0.63        .78            [ill.]

[The remaining rows (the rest of Mathematics and Utility; 6. Mathematics and Gender;
7. Mathematics as a Process, students; Mathematics as a Process, teachers) and the
item counts for scales 4 to 7 are only partly legible in the source. Legible item counts,
in source order: 19, 15, 15. Legible row fragments, in source order: .87, .49-.69;
.83, .52-.74; 13 Ont. 3.83; 58; 57; 57; 0.37, .61; Calc. U.S. 3.35, 0.37, .63, .05-.44;
0.41, .74; 0.36, .65, .08-.53; .036, .63, 0.02-.43.]

(a) Hoyt (1941)

Students were aggregated to the class level, yielding a class-by-item matrix for each scale
and country.

The class median was used as the aggregated class-item score. Resistant to
outliers, it was felt that the median would provide a more valid measure of class
performance than the more typically used mean.
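
The aggregation step can be sketched as follows: student item scores are grouped by class, and the median of each item is taken within each class. This is an illustration only; the record layout (a class identifier paired with a list of item scores), the function name `class_item_medians`, and the data are hypothetical.

```python
from collections import defaultdict
from statistics import median

def class_item_medians(records):
    """records: iterable of (class_id, [item scores]) pairs, one per student.

    Returns {class_id: [median of each item within that class]}.
    """
    by_class = defaultdict(list)
    for class_id, scores in records:
        by_class[class_id].append(scores)
    # zip(*rows) turns a class's list of student rows into per-item columns.
    return {c: [median(col) for col in zip(*rows)] for c, rows in by_class.items()}

# Hypothetical data: three students in class A and two in class B, two items each.
records = [
    ("A", [1, 5]),
    ("A", [2, 4]),
    ("A", [5, 4]),
    ("B", [3, 3]),
    ("B", [4, 1]),
]
medians = class_item_medians(records)
print(medians)
```

In class A the single response of 5 on the first item pulls the class mean up to about 2.67 but leaves the class median at 2, which is the resistance to outliers referred to above.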

Statistical analysis. A 5 x k (country-by-item, k the number of items) median
polish (Tukey, 1977; Velleman & Hoaglin, 1981) was used to analyze the unique and
joint effects of country and item. Similar to analysis of variance, the median polish is
based upon the additive model, but fits the model by finding row and column medians
and by using iteration to obtain a final solution. Row effects indicate the extent to which
countries responded more or less positively to the k items in a scale. Column effects
correspond to item effects and indicate which items were responded to more or less
favourably. Finally, cell entries contain the residuals. Countries or items which fail to
follow a general pattern established by other countries or items will produce residuals.
These represent unique patterns of response by students in a particular country to
particular items.

The median polish was completed separately within each scale using the
computer program Minitab (Ryan, Joiner, & Ryan, 1982) with two complete iterations.
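
The additive fit described above can be sketched in a few lines. The code below is a generic median polish, not the Minitab routine used in the study: it alternately sweeps row and column medians out of the table and accumulates them as an overall level, row (country) effects, and column (item) effects, leaving residuals in the cells. The default of two iterations mirrors the two complete iterations mentioned above; the function name `median_polish` and the small 3 x 4 table are hypothetical.

```python
import numpy as np

def median_polish(table, n_iter=2):
    """Fit table[i, j] = overall + row[i] + col[j] + resid[i, j] by median polish."""
    resid = np.array(table, dtype=float)
    overall = 0.0
    row = np.zeros(resid.shape[0])
    col = np.zeros(resid.shape[1])
    for _ in range(n_iter):
        # Sweep the row medians out of the residuals and into the row effects.
        r_med = np.median(resid, axis=1)
        resid -= r_med[:, None]
        row += r_med
        # Move the median of the column effects into the overall level.
        delta = np.median(col)
        col -= delta
        overall += delta
        # Sweep the column medians out of the residuals and into the column effects.
        c_med = np.median(resid, axis=0)
        resid -= c_med[None, :]
        col += c_med
        # Move the median of the row effects into the overall level.
        delta = np.median(row)
        row -= delta
        overall += delta
    return overall, row, col, resid

# Hypothetical, exactly additive table: 3 "countries" (rows) by 4 items (columns).
country_effects = np.array([-0.2, 0.0, 0.3])
item_effects = np.array([-0.5, -0.1, 0.1, 0.4])
table = 3.5 + country_effects[:, None] + item_effects[None, :]

overall, row, col, resid = median_polish(table)
print(overall, row, col)
print(resid)
```

For an exactly additive table the residuals vanish; a cell that departs from the row-plus-column pattern shows up as a non-zero residual, which is how a country-specific response to a particular item is detected.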

As is the case for other exploratory data analysis techniques, the median polish
does not have an accompanying statistical hypothesis-testing procedure to identify
significant effects. Instead, Tukey (1977) recommends the use of judgment, taking into
account the nature of the distribution of effects. Examination of the distributions in the
present study revealed that many of the effects were either equal to zero or close to zero.
Application of a rule of thumb based on hinges and multiples of the H-spread suggested
by Tukey (1977, p. 383) led to inconsistent findings across item effects and country-by-item
effects. In some instances the largest item effect was within the upper and lower
hinges for items, while country-by-item effects of smaller magnitude were outside these
hinges of residuals. To try to clarify the situation, a 5 x k fixed effects analysis of variance
was performed in which "item" was treated as a repeated measures factor. The effect
sizes yielded by this analysis were very similar in magnitude to those produced by the
median polish. And, as with Tukey's rule, inconsistent findings were observed: small,
near zero effects were significant (p < .01, Greenhouse-Geisser and Huynh-Feldt
probabilities; Kirk, 1982, pp. 259-262) in the case of students, but not for teachers. The use
of class medians rather than individual student scores accounts for this increase in
power.

Therefore, favouring a uniform procedure, the following rule was used: effects
less than 0.20 in absolute value were considered non-significant and equal to zero. This
cut-off corresponded in most instances to a natural breakpoint in the distributions of
median polish effects, reflected fairly the skewness of these distributions where it
existed, and represented a difference of 10 percent, the minimum considered necessary
for interpretation.
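
The screening rule is simple enough to state in code. The sketch below sets to zero every effect smaller than 0.20 in absolute value and leaves the others unchanged; the function name `screen_effects` and the example values are hypothetical.

```python
import numpy as np

def screen_effects(effects, cutoff=0.20):
    """Treat effects smaller than the cutoff in absolute value as equal to zero."""
    effects = np.asarray(effects, dtype=float)
    return np.where(np.abs(effects) < cutoff, 0.0, effects)

# Hypothetical median polish effects.
screened = screen_effects([0.05, -0.19, 0.20, -0.35])
print(screened)
```

An effect of exactly 0.20 is retained under this reading, since the rule as stated covers only effects less than 0.20 in absolute value.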

To help clarify the presentation and discussion of results, the scales have been 
divided into three groups. The first group deals wi»h the mathematics curriculum and 
contains the Mathematics in School scales and the Calculators and Computers scale. The 
second group centres on personal perceptions of mathematics, and included the Home 
Support, Mathematics and Me, Mathematics and UtiUty, and Mathematics and Gender 
scales. LasUy, the third group concentrates on views of the disdphne of mathematics and 
contains the Mathematks as a Process scale. 

Student Perceptions of the Mathematics Curriculum
Results on the Calculators and Computers items indicated that, generally, Grade
12 students in B.C., Grade 12 and 13 students in Ont., and Pre-Calculus and Calculus
students in the U.S. were positive about the efficacy and benefits of calculators and
computers (median item score = 3.60). They considered the 15 topics and activities in the
Mathematics in School scale to be somewhat important (m = 3.58). They were, however,
undecided about the difficulty of many of the topics and activities (m = 3.03).

These overall findings held up across the five countries. As might have been
predicted from consideration of the means listed in Table 4, the country effects produced
by median polishing the corresponding country-by-item matrices were all less than 0.20
in absolute value.

Examination of the country-by-item effects revealed essentially the same finding.
Of the 265 residuals, only 33 exceeded the significance criterion, and of these, only 16
could be reasonably explained. Not unexpectedly, Grade 13 students in Ont. and, to a
greater extent, Calculus students in the U.S. indicated that differentiating and integrating
functions (items 10, 13, Mathematics in School) were activities which were more
important, relatively easy, and best liked. The U.S. Calculus students also considered
drawing graphs of functions (11) and finding a limit of a function (12), two related
activities, to be more important. They also indicated proving theorems (6) was
somewhat less important and they tended to dislike this topic. This set of findings is
attributable to the differences in curriculum between the countries. Grade 12 students
in B.C. and Ont. and Pre-calculus students in the U.S. are typically not exposed to calculus.
Students in Grade 13, Ont. generally are exposed to a variety of advanced topics
(trigonometry, geometry, advanced algebra, calculus) while the calculus students
concentrate for the most part on calculus. It seems likely that the negative feelings
expressed by U.S. calculus students toward proving theorems can be attributed to their
recent experience with proof in calculus.

The remaining significant residuals either failed to form meaningful patterns or
were not easily explained. For example, it is not clear why, in comparison to the other
students, Grade 12 students in Ont. found memorizing rules and formulas (2) more
important, relatively less difficult, and more to their liking. Nor is it clear why Grade 12
students in B.C. particularly enjoyed investigating sequences and series (9), why U.S.
Calculus students found determining the probability of an outcome (14) difficult, or why
Grade 13 students in Ont. considered getting information from statistical tables (4) less
important. Thus, except for the country-by-item effects explained by exposure to calculus,
students from B.C., Ont., and the U.S. had similar perceptions of the topics and activities
presented.

Item differences. In contrast to the absence of country differences, there were
several significant item effects. These items are shown in Figure 1. The median item
scores listed at the zero effect point provide a reminder of the overall position of each
item set.

Polarity for negatively worded items has been reversed.
Figure 1. Items Related to the Mathematics Curriculum 



1. For all three groups, the findings presented here are based upon the detailed 
results available from the authors. 

The two activities considered most important were checking an answer by
going back over it (1) and memorizing rules and formulas (2). These two activities were
also two of the three least liked. The importance assigned to these tedious, relatively
disliked activities may be explained by the practical orientation of the students. As will
be seen later, student responses to Mathematics and Utility items suggested that the
students do not study mathematics because they like it or find it intrinsically interesting,
but rather because of its practicality.

The greater importance attached to solving equations (5) and solving word
problems (3) is also consistent with this practical view of mathematics. The students did,
however, react differently in their assessment of the difficulty of these activities: solving
equations is easy, while solving word problems is more difficult. This difference may
reflect the greater complexity of solving word problems; a problem must first be
understood and correctly translated into an equation which then must be solved. This
notion that more complex activities are perceived to be more difficult than simple
straightforward activities can also be seen in the ratings of the difficulties of the
following four activities: "proving theorems" (6) and "integrating functions" (13) were
rated more difficult; "getting information from statistical tables" (4) and "drawing
graphs of functions" (11) were rated less difficult (easy).

The fifth most important activity, "using a hand-held calculator" (15), was also
the easiest and best liked activity of the 15 considered. When asked to react to specific
issues related to calculators and computers, the students were equally, but judiciously,
enthusiastic. They disagreed that calculators eliminated the need to learn to compute (2,
Calculators and Computers), and they felt that calculators were not particularly useful in
learning different mathematical topics (3). The use of calculators did not ameliorate
their dislike for solving word problems (4) (suggesting that the interpretation and
translation of word problems is what students most dislike). The students agreed that
computers were beneficial (5, 8), and endorsed the suggestion that "everybody should
learn something about computers" (7). These findings are congruent with the
prominent role played by calculators and computers in a modern technological society,
reflect the practical orientation of the students, and are indicative of the strong emphasis
being given to learning about and how to use calculators and, especially,
microcomputers in today's schools.

Personal Perceptions of Mathematics 
In general, Grade 12 students in B.C., Grade 12 and 13 students in Ont., and
Pre-calculus and Calculus students in the U.S. shared the same perceptions of their parents'
ability in and support for mathematics, enjoyed to the same extent and felt equally
competent studying mathematics, felt the same way about the importance of
mathematics in preparing for an occupation and the usefulness of mathematics in
everyday life, and held common views about the mathematical capability of boys and
girls.

Two country effects exceeded the minimum criterion for significance. Both
Pre-calculus and Calculus students in the U.S. were stronger in their view that mathematics
is important in preparing for a job and in solving everyday problems. Mathematics and
the sciences enjoy a relatively high profile in the U.S. In a country considered a world
leader in scientific advances and industrial development, mathematics and science are
continually stressed. National ills of the country are often traced to the failure of the
schools, and frequently to the failure of the schools to provide an adequate education
and training in mathematics and science. The magnitude of involvement in like
activities in Canada and the competitiveness of Canadians appear not to be as great.

Thirteen of the 200 country-by-item effects were significant. Again, not all appear 
to be meaningful. Of the 13, only six could be reasonably explained. U.S. Calculus 
students perceived their mothers as enjoying mathematics less and as less capable of 
assisting them with their homework (2, 4, Home Support). Given that these students 
were studying calculus, and that fewer women than men in the past studied 
mathematics beyond senior high school, and therefore, calculus, these findings are not 
surprising. 

The U.S. Calculus students were more confident of their own ability to do
mathematics (6, 11, Mathematics and Me), and to become good mathematicians (12).
Presumably among the most able students in school, they strongly looked forward to
taking more mathematics (3).

Item differences. The median polish yielded several significant item effects,
particularly for items in the Home Support and Mathematics and Me scales. The
significant items are shown in Figure 2.

Polarity for negatively worded items has been reversed.
Figure 2. Items Related to Personal Perceptions of Mathematics



As shown in the Home Support scale, the students felt that their parents
considered mathematics to be an important subject for them (the students) to study (6,
7), and that their parents encouraged them in their mathematical studies (8, 9). They did,
however, feel that their parents usually were not very interested in helping them with
mathematics (5). They questioned the ability of both their fathers (3) and, especially, their
mothers (4) to do their homework, and indicated that their mothers tended not to enjoy
mathematics (2). If these latter perceptions are accurate, then the observation that their
parents, while supportive, are uninterested in assisting them with their work is
understandable. It seems apparent that the students believed that the mathematics they
were studying was beyond that studied by their parents. Still, this did not appear to
diminish the positive disposition of the parents toward their children's study of
mathematics or their desire for their children to do well (9).

The students' perceptions about themselves were less clear. They wanted to do
well in mathematics (1, Mathematics and Me). In general, they felt competent, but, with
the exception of U.S. Calculus students, the students were uncertain that they could ever
become good mathematicians (12). They were also undecided as to whether they were
looking forward to taking more mathematics (3). Furthermore, they were unsure about
spending a lot of their own time doing mathematics (10) and working for a long time to
understand new ideas (14). Confronted with a problem they could not solve, they
reported that they felt "lost in a maze" from which they could not find their way out
(19). Yet, when they solved a problem, they felt good (4). Though mathematics did not
make them "happy" (15), nor was it "fun" (18), the students did not fear taking
mathematics (16).

Taken as a whole, these findings are not too surprising. They are consistent with
what would be expected from students who felt they "had" to take mathematics. The
high retention rates and graduation requirements of Canadian and American schools
result in more students than just the academically able taking senior level mathematics.
For the majority, mathematics may be more a means to an end, and not an end in itself.

This conjecture is supported by the effects observed for the Mathematics and 
Utility items. There was general agreement that mathematics was needed in everyday 
life (4, 5, 7). The students further agreed that knowledge of mathematics is necessary for 
most occupations (8), although they were not as sure that most mathematics had 
practical use on the job (6), or that most people actually used mathematics in their work 
(2). It appears the students believed that, in order to get a job, it was necessary to study
mathematics, but what was actually covered was not always relevant to what was
needed. Support for this interpretation can be seen in the differential importance
assigned to some of the topics and activities of the mathematics curriculum. Moreover,
this helps explain some of the indecision noted in the students' self-perception.

The students displayed a high degree of support for the equality of boys and girls.
They agreed that a woman needs a career as much as a man (4, Mathematics and
Gender), and that there were no differences between boys and girls in their ability in and
need for mathematics.

Student and Teacher Perceptions of Mathematics as a Process

The students were, in general, uncertain about the nature of mathematics as a
field of study (median item score = 3.42). Their teachers, while not always consistent, were
generally more decided (m = 3.74). No country effects were found, and, except for a
consistent country-by-item effect which revealed teachers in B.C. were less rule oriented
(5, 9, 10, 11), no meaningful residuals were observed. As before, there were item
differences for both students and their teachers.

The teachers were somewhat inconsistent in their view of mathematics as a
changing field. They agreed that there had been recent discoveries in mathematics (12),
but were undecided about changes in the near future (1). While more consistent, the
students were essentially undecided about whether or not mathematics is a changing
field.

Teachers generally agreed that mathematics provided the opportunity for
originality (3, 8). The students were less sure. The students tended to disagree that
learning mathematics involved mostly memorizing; the teachers clearly disagreed (8).

Both teachers and students agreed that "mathematics helps one think logically"
(15). When asked if "mathematics helps one think according to strict rules" the teachers
agreed, while the students were undecided (5). The students were clearly undecided
about whether or not mathematics was a set of rules; their teachers tended to disagree
(13). The students, though, were more rule oriented in their solution of mathematics
problems (9, 11). Somewhat contradictory to these rules, students tended to agree that
trial and error can often be used to solve a problem, while their teachers were less
decided (10).

Taken together, these results suggest that, in general, the teachers were more
process oriented than their students. This finding is in keeping with the suggestion that
the greater the experience, the greater the process orientation. But the lack of a
process-oriented view of mathematics by the students is somewhat puzzling. As senior level
students, they ostensibly have had a fair amount of experience in mathematics. This
leads to questions about the type of experience they have and the way in which
mathematics is taught. It may be that the students, with their practical orientations,
focussed on answering a problem correctly by the "right" rule, and that they cared little
about how rules operate or from where they came. All that was needed was to know the
right one and how to apply it. Teachers, with more mathematics education and
experience, appear to be more insightful about the derivation and use of rules. It seems,
though, that their teaching may be less process oriented, with stress placed on a "right
rule-right answer" approach.

Summary 

Overall, the findings presented and discussed support the similarity hypothesis
suggested in the introduction, and reflect a practical view of mathematics. Grade 12
students in B.C., Grade 12 and 13 students in Ont., and Pre-calculus and Calculus students
in the U.S. indicated practicality, and non-intrinsic worth, as the reason for studying
mathematics. For the majority, mathematics appeared to be a means to an end, and not
an end in itself.

Consistent with this view, the students considered the 15 curriculum topics and
activities presented to be important, but they were unsure of their difficulty and less
likely to like them. The students indicated that, although they would take more
mathematics, they were unwilling to commit much of their "own" time in studying
mathematics, and felt uncomfortable with new problems. Instead, they saw mathematics
not so much as a field involving speculation and conjecture, but as a field in which
problems were solved by a "learned, right" rule.

These results are disappointing but understandable. It is to be hoped that students
in senior mathematics classes would have a more process oriented, somewhat less
utilitarian view of mathematics. This is not to say that practicality does not have a place;
rather it is a question of balance. Why this balance was not more evident is attributable,
at least in part, to the prevailing opinions held by many that mathematics is a service
course, and to the way in which it is likely taught. The mathematics curriculum, as
presently structured, favours a more linear, systematic approach, with little room for
considering the development of mathematics as a field of study.

If these perceptions are indeed accurate, then students will need not only a "how
to do it" acquaintance with mathematics, but also greater understanding of its place in a
rapidly changing technological society, both in terms of its impact and its potential.
Helping students to explore the nature of mathematics, as well as how to do it, is an
important aspect of the development of a mathematically literate society.

Appendix

Items and Scales 
Mathematics in School

1. Checking an answer to a problem by going back over it.

   (a) very important   important   undecided   not important   not at all important
   (b) very easy        easy        undecided   hard            very hard
   (c) like a lot       like        undecided   dislike         dislike a lot

2. Memorizing rules and formulas. (Response categories for all remaining items in this
group are as shown for the first item.)

3. Solving word problems.

4. Getting information from statistical tables.

5. Solving equations.

6. Proving theorems.

7. Using vectors.

8. Working with complex numbers.

9. Investigating sequences and series.

10. Differentiating functions.

11. Drawing graphs of functions.

12. Finding a limit of a function.

13. Integrating functions.

14. Determining the probability of an outcome.

15. Using a hand-held calculator.

Calculators and Computers

(Items marked * in this and the remaining scales are negatively worded.)
*1. It is less fun to learn mathematical ideas if you use a hand-held calculator.

   strongly disagree   disagree   undecided   agree   strongly agree

2. If you use a hand-held calculator you do not have to learn how to compute.
(Response categories for the remaining items in this and other scales are as shown
for the first item.)

3. Using a hand-held calculator can help you learn many different mathematical
topics.

4. Solving word problems is more fun if you use a hand-held calculator.

5. Computers solve problems better than people do.

*6. Using computers makes learning mathematics more mechanical and boring.

7. Everybody should learn something about computers.

8. Computers do lots of good things for people.
Home Support for Mathematics

1. My father seems to enjoy doing mathematics.

2. My mother seems to enjoy doing mathematics.

3. My father would usually be able to do my mathematics homework problems if I
asked him for help.

4. My mother would usually be able to do my mathematics homework problems if I
asked her to help.

5. My parents are usually very interested in helping me with mathematics.

6. My mother thinks that learning mathematics is very important for me.

7. My father thinks that learning mathematics is very important for me.

8. My parents encourage me to learn as much mathematics as possible.

9. My parents want me to do very well in mathematics class.

Mathematics and Me

1. I really want to do well in mathematics.

2. My parents really want me to do well in mathematics.

3. I am looking forward to taking more mathematics.

4. I feel good when I solve a mathematics problem by myself.

5. I usually understand what we are talking about in mathematics class.

*6. I am not so good at mathematics.

7. I like to help others with mathematics problems.

8. If I had my choice I would not learn any more mathematics.

9. I feel challenged when I am given a difficult mathematics problem.

*10. I refuse to spend a lot of my own time doing mathematics.

*11. Mathematics is harder for me than for most persons.

*12. I could never be a good mathematician.

*13. No matter how hard I try I still do not do well in mathematics.

14. I will work a long time in order to understand a new idea in mathematics.

15. Working with numbers makes me happy.

*16. It scares me to have to take mathematics.

17. I usually feel calm when doing mathematics problems.

18. I think mathematics is fun.

*19. When I cannot figure out a problem, I feel as though I am lost in a maze and
cannot find my way out.

Mathematics and Utility 

1. It is important to know mathematics in order to get a good job. 

*2. Most people do not use mathematics in their job.

3. I would like to work at a job that lets me use mathematics.

4. Mathematics is useful in solving everyday problems.

*5. I can get along well in everyday life without using mathematics.

6. Most of mathematics has practical use on the job.

*7. Mathematics is not needed in everyday living.

*8. A knowledge of mathematics is not necessary in most occupations.

Mathematics and Gender

*1. Men make better scientists and engineers than women.

*2. Boys have more natural ability in mathematics than girls.

*3. Boys have to know more mathematics than girls.

4. A woman needs a career just as much as a man does. 



Mathematics as a Process

1. Mathematics will change rapidly in the near future.

2. Mathematics is a good field for creative people.

*3. There is little place for originality in solving mathematics problems.

4. New discoveries in mathematics are constantly being made.

*5. Mathematics helps one to think according to strict rules.

6. Estimating is an important mathematics skill.

7. There are many different ways to solve most mathematics problems.

8. Learning mathematics involves mostly memorizing.

9. In mathematics, problems can be solved without using rules.

10. Trial and error can often be used to solve a mathematics problem.

*11. There is always a rule to follow in solving a mathematics problem.

*12. There have not been any new discoveries in mathematics for a long time.

13. Mathematics is a set of rules.

14. A mathematics problem can always be solved in different ways.

15. Mathematics helps one to think logically.



The research reported in this paper was supported by a grant from the Social Science
and Humanities Research Council of Canada (No. 410-83-0702). We would like to
thank Robert Prosser for his able assistance in carrying out the data analysis.

The emergence of the modern nation-state and the emergence of mass education
are closely intertwined. The development of modern nation-states relied, in part, upon
several functions of formal schooling, such as the creation of citizens, the establishment
of a legitimated system of economic and political allocation and the socialization of a
labor force for a national economy. At the same time, agencies of the state provided
resources for funding and chartering of educational expansion and, thereby, influenced
the organization and content of educational activities.

In this paper, we investigate an aspect of the relationship between the state and
schooling, the state's control of the curriculum. We examine whether national state
regulation of the curriculum is related to curriculum implementation in the classroom.

The linkages of macrosociological characteristics, such as state control, to
microsociological characteristics, such as implementation of the curriculum, are seldom
studied because of extensive data requirements. To examine such an issue, we have
created a large comparative data set of 15 educational systems with information on the
political incorporation of education as well as implementation of curriculum in the
classroom.

Political Incorporation of Curriculum Control

In assessing the relationship between the state and education, Ramirez and 
Rubinson (1979) contend that world-wide growth in state authority and power increases 
the political incorporation of education. They suggest that the political incorporation of 
education can explain several recent trends in education. 



This is an early draft of this paper. The final version will be published in Sociology of 
Education and that version of the paper should be cited. 

One major trend is the world-wide expansion of formal schooling as measured by
enrollment rates. Ramirez and Rubinson find that the state's authority and power are
clearly related to growth in enrollments in all public sectors (primary, secondary and
tertiary) of schooling. They also argue that the political incorporation model generally
fits results from other studies of the growth in enrollments (e.g., Boli, Ramirez & Meyer,
1985; Meyer & Hannan, 1979) and that there is a lack of empirical support for either
human capital or status conflict accounts of educational expansion (Rubinson, 1986).

A second major trend is the growth in the number of educational systems with
compulsory schooling laws. A recent study of the compulsory schooling laws in the
19th century indicates a relationship between political incorporation and the passage of
compulsory schooling laws (Ramirez & Boli, 1987).

What has not been adequately studied is whether the political incorporation of
education influences educational activities in the classroom. We examine this issue for
one significant educational activity, implementation of the curriculum in the
classroom. Ramirez and Rubinson (1979) discuss the proposed research question as a
needed critical test of the political incorporation model of the relationship between the
state and education.



Official Curriculum and the Implemented Curriculum

There is a renewed interest among sociologists in the study of state control and
the content of the official curriculum. For example, an area of considerable interest is
the changing content of national curriculum and the process by which a curricular
subject is defined and institutionalized as a legitimate subject (e.g., Goodson, 1988;
Goodson & Ball, 1984). These studies examine the historical development of the
curricula, with an emphasis on how local politics shape the contents and definition of
the official curricula (Apple, 1979). From an institutional perspective, others have
analyzed the increasing homogeneity in the subject composition of official curricula of
national educational systems from 1920 to 1985 (Benavot & Kamens, 1989; Benavot,
Kamens, Wong & Cha, 1988).

Although these studies differ in their theoretical perspectives, they all focus on
the official curricula of schooling. The official curriculum is part of an elaborate
classification system that defines the appropriate categories of instruction. Schools
incorporate these categories into their organizational structure and activities (Meyer &
Rowan, 1978). If the official curriculum requires the study of mathematics, schools
create departments of mathematics, hire teachers of mathematics, and offer courses in
mathematics.

Schools tightly control and monitor their compliance with the subject
categories of the official curriculum (Meyer, 1983). School officials are concerned that
the curriculum "fit" the state mandated curriculum. For example, they are concerned as
to whether they offer the appropriate classes of algebra, world geography, and other
subjects. By being in conformance with the categories of the official curriculum, schools
maintain their legitimacy, gain access to resources and avoid sanctions, such as a loss of
accreditation (Meyer & Rowan, 1978).

While schools tightly monitor their curricular offerings, there is variation in the
degree to which there are organizational controls over the implementation of the
official curriculum in the classroom.

Organizational Controls Over Instruction 

Instruction is part of the technical activity of schools and one of the educational
outputs of schooling. In systems that are more loosely coupled, such as in the U.S.,
educational organizations exercise weak bureaucratic controls over instruction. This is
because in these systems the technical activities of schooling, instruction and learning,
are buffered from inspection and assessment. Schools seldom attempt to assess these
organizational outputs of schooling, in part, because of a lack of market pressures. The
technical environments of schools in these types of systems do not provide significant
constraints as neither the survival nor profitability of the school is determined by the
quantity or quality of instruction. While schools keep elaborate records of certain types
of educational outputs such as attendance, course enrollments, and number of
graduates, they seek to avoid inspection of instruction. Thorough and frequent
inspections of instruction may reveal inconsistencies and inefficiencies, thereby creating
a challenge to existing organizational arrangements (Meyer, Scott & Deal, 198.).
Teachers, in these types of systems, have a great deal of autonomy and discretion in the
handling of instruction and learning. They often modify the official curriculum to meet
their needs or those of their students, and, therefore, teachers teaching the same subject
within a school may differ in the amount of material covered, the type of topics covered,
the amount of time spent on instruction, and the use of curricular materials.

In other educational systems, the control of the curriculum and its
implementation is greater. Classroom processes in these systems are less buffered from
external influence. The technical environments of these schools are more clearly
defined and influence larger segments of educational activities.

We argue that the degree to which an educational system is incorporated into the
state will influence the degree to which the technical environment of schooling is
controlled. The more incorporation with the state, the more control and the less autonomy
at the classroom level. There are a host of mechanisms through which the state can
control the implemented curriculum. These range from concrete forms of social
control, such as state inspection, monitoring teacher training and formal assessment of
student achievement, to more indirect forms of control, such as shaping the definitions
of instruction and socialization of teachers. Although we do not measure these
mediating mechanisms here, we can assess the presence or absence of their combined
influence on the implemented curriculum.



State Control of the Curriculum and Implemented Curriculum

Educational systems vary in the degree of political incorporation of curricular
subjects and their content. In some educational systems, control over curricular issues is
highly centralized and managed at the national ministry of education. In other
educational systems, curricular issues are dealt with at the provincial or local level. The
degree of political incorporation of curricular matters affects the degree of
environmental specification of instruction.

If state control over the curriculum is located at the national level, the
environment is less complex and there will be greater specification of instruction.
Through the ministry of education, or some administrative counterpart, there is an
administrative mandate for what the curriculum should be. Such a mandate may be
reflected in the curricular guidelines, the training of teachers, the content of curricular
materials, and items on student achievement tests.

The national educational agency also may institute a set of bureaucratic controls 
to assure implementation of the curriculum. For example, state inspectors may 
occasionally visit classrooms to assess the content of instruction, or academic 
achievement tests may be used to determine how students are allocated to classes and 
curricula. The effects of such bureaucratic controls on classroom instruction, however, 
are not well documented and may create little more than procedural compliance. 

In educational systems with local political control of the curriculum, the 
environment of teaching is more complex and there is less specification of instruction. 
The administrative mandate as to what teachers should teach is weaker; there will be 
greater diversity in the textbooks available for use by the teachers; and there will be 
greater diversity in the types of training available for teachers. Since schools receive 
local funding and rely upon community support, they are likely to be more responsive 
to local constituencies. 

This discussion of state control, technical environments and the 
implemented curriculum suggests two hypotheses about political incorporation. 

Hypothesis 1: The greater the degree to which education is incorporated within the 
state, the greater the control over the implementation of curriculum in the classroom, 
which will be reflected in more uniformity in implementation by teachers within a 
system. 

Hypothesis 2: Political incorporation simplifies the technical environments of 
schools; thus, in highly incorporated systems, local factors of classrooms will not 
influence the implemented curriculum. 

Data and Methods 

Testing these hypotheses requires detailed data about classroom instruction in 
educational systems that vary in the degree to which education is politically 
incorporated. The Second International Mathematics Study (SIMS) undertaken by the 
International Association for the Evaluation of Educational Achievement (IEA) 
provides this type of data. This large data set represents a powerful analytic resource for 
cross-national study of education. The countries in which SIMS collected data represent 
a diverse set of societies in terms of their size, geographic location and level of 
development. The use of a standard sampling procedure within each country yielded 
high quality samples of classrooms. Extensive efforts were undertaken to assure that 
comparable data collection procedures were used in each educational system. 
The SIMS data were collected in 20 educational systems.^1 Of the 20 educational 
systems represented in the SIMS data set, 15 had full classroom process questionnaires.^2 
In each educational system, a four step, stratified-random sample of 8th grade 
mathematics classrooms was drawn.^3 This yielded over 2200 classrooms. For each 
class, detailed information was collected from the teacher about the amount and type of 
instruction in mathematics during the year. For 157 items in mathematics, each teacher 
was asked whether or not they had taught such an item during the year. Teachers in 
each educational system were asked for the same information about the same 157 items in 
mathematics. 

For each educational system, a board of educational experts designated which of 
the 157 items in mathematics were part of the national curriculum in mathematics for 
8th grade.^4 How much this so-called national curriculum overlapped with the official 
curriculum in various parts of each educational system was not evaluated by SIMS. At 
the very least, the measure of national curriculum, which we used here, represents the 
largest possible set of mathematics skills that an 8th grade teacher would cover on 
average in the course of the year. 

Description of Measures 

The political incorporation of education, as Ramirez and Rubinson (1979) define 
it, refers to the extent of national control over schooling. They suggest that a valid 
measure of political incorporation is the level of political control over education. The 
more that control occurs at the national level, the more schooling is politically 

^1 We analyze national educational systems, except for Canada, which 
collected data separately in British Columbia and Ontario. Because of some minor 
differences in data collection in these two provinces, we analyze them separately. 

^2 SIMS in Hong Kong, Scotland, French Belgium and Nigeria did not include 
questions about the implementation of curriculum. The Flemish Belgium sample 
did, and we will use it to represent Belgium. Swaziland was dropped from the 
analysis because only one-fifth of the teachers completed this part of the 
instrument. 

^3 See Garden (1987) for a detailed description of the SIMS study. 

^4 In each country this board was made up of representatives from the 
ministry of education, the teachers' union, teachers and school district level 
administrators. The panel was asked to assess which of the items from the item 
pool would most likely be part of the standard 8th grade mathematics 
curriculum in their country. The Japanese ministry decided that the items were 
too easy for the bulk of its 8th grade students, so 7th grade classrooms were 
sampled. 
incorporated with the state. As an indicator of this construct we have slightly modified 
a scale developed by Ramirez and Rubinson (1979). We used a seven point scale and 
ranked each country in terms of the political level that had the greatest control over the 
curriculum: 1) local control, 2) local and provincial control, 3) provincial control, 4) 
local, provincial and national control, 5) local and national control, 6) provincial and 
national control and 7) national control. In coding each system on this scale, we 
consulted standard reference sources (International Encyclopedia of Education, 1985; 
International Handbook of Educational Systems, 1983) as well as an IEA publication with 
descriptions of the educational systems (Travers & Westbury, 1989). Three raters 
independently scored each educational system on the scale. The level of agreement 
among the three raters was above 98%. 

From the SIMS data we constructed several indicators of different dimensions of 
the implemented curriculum. First, we took the number of items in the national 
curriculum (as determined by the panels of educational experts) as an indicator of the 
size of a system's official mathematics curriculum. Second, for each educational system, 
we calculated the percentage of the national curriculum that a teacher taught during the 
year and calculated a mean and standard deviation as indicators of the amount of 
curriculum covered in the system and the variation in the amount of curriculum 
covered. Third, we calculated the percentage of teachers in each educational system who 
taught each of the items in the national curriculum. As an indication of agreement 
among teachers' implementation of curriculum, we counted the number of items that 
were taught by either 90% or more of the teachers or 10% or less of the teachers. 
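
The indicators just described can be sketched in a few lines of code. The following is only an illustration under stated assumptions: the input layout (one set of taught item ids per teacher), the function name `coverage_measures` and the tiny example data are hypothetical and are not taken from the SIMS files.

```python
from statistics import mean, stdev


def coverage_measures(taught_by_teacher, national_curriculum):
    """Summarize implementation of a national curriculum in one system.

    taught_by_teacher: list of sets, one per teacher, holding the ids of the
        items that teacher reported teaching (hypothetical input layout).
    national_curriculum: set of item ids that the expert panel placed in the
        national curriculum.
    """
    size = len(national_curriculum)
    n_teachers = len(taught_by_teacher)

    # Percentage of the national curriculum that each teacher covered.
    pct_covered = [
        100.0 * len(items & national_curriculum) / size
        for items in taught_by_teacher
    ]

    # Share of teachers who taught each curriculum item.
    share_teaching = {
        item: sum(item in items for items in taught_by_teacher) / n_teachers
        for item in national_curriculum
    }

    # Agreement indicator: items taught by >= 90% or <= 10% of teachers.
    agreement_items = sum(
        1 for share in share_teaching.values() if share >= 0.90 or share <= 0.10
    )

    return {
        "size": size,
        "mean_pct": mean(pct_covered),
        "sd_pct": stdev(pct_covered),
        "agreement_items": agreement_items,
    }


# Hypothetical system: four curriculum items, four teachers.
# Item 9 is outside the national curriculum and is ignored.
example = coverage_measures(
    [{1, 2, 3}, {1, 2}, {1, 2, 4, 9}, {1}],
    {1, 2, 3, 4},
)
print(example)
```

The sketch uses the sample standard deviation; whether the original analysis used the sample or the population formula is not stated in the text.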

Finally, we have measures of local factors which might influence the 
implementation of the curriculum in each class, such as the range in the mathematics 
abilities of students, the level of mastery of mathematics, the age and sex of the teacher, 
and the number of years the teacher has been teaching in general and teaching 
mathematics in particular. We also have measures of the number of periods of 
mathematics per week and the average length of a mathematics period. 

Analysis Plan 

First, we correlated measures of various dimensions of the implementation of 
curriculum with the indicator of state control of the curriculum. Next, we used a model 
of teacher coverage of the national curriculum and estimated this model with each 
system's data. There are several advantages in doing this type of analysis, which is a 
standard approach to the analysis of student or classroom data and national factors 
(Heyneman & Loxley, 1982; 1983). Since our hypotheses are about relationships between 
institutional characteristics of systems, we required indicators of curriculum coverage at 
the system level and therefore we do not combine all classrooms into one sample. This 
approach allows our analysis to incorporate differences in the size and nature of the 
national mathematics curriculum in each system. It also allows us to handle some of 
the minor differences in questionnaires and procedures that are almost inevitable in a 
comparative study of this size and complexity. 

Results 

In the first column of Table 1 are measures of the size of the 8th grade 
mathematics curriculum in each educational system. While all of the educational 
systems in the sample had 8th grade mathematics, the size of their curriculum varied. 
The sample mean was 125.1 items (or 80% of the 157-item pool), with a standard 
deviation of over 16 items. The range in size was substantial. Three educational 
systems (New Zealand, Japan and Hungary) had a large curriculum that covered 
approximately 140 items (or over 90% of the 157-item pool). At the lower end, Belgium 
(Flemish) and Luxembourg had curricula that covered approximately 95 items (or only 
60% of the 157-item pool). 

The second column in Table 1 shows the mean number of items of the 
curriculum that were taught during 8th grade by each system's teachers. Here there is 
considerable variation, with a standard deviation of 20 items and a range of over 70 
items. Japanese teachers taught the most, with a mean of 117.2 items (or 75% of the 
157-item pool), while teachers in Canada (British Columbia) taught the least, with a mean 
of 42.7 items (or only 27% of the 157-item pool). 

The third column in Table 1 is the mean number of items taught as a percentage 
of the total number of curricular items. In none of the educational systems studied did 
the "average teacher" cover the entire 8th grade curriculum. The sample mean is 65% 
with a standard deviation of over 15 percentage points. There is also a large range in 
coverage, with teachers in Belgium (Flemish) and Japan providing instruction for over 
80% of their curricula and teachers in British Columbia and The Netherlands providing 
instruction for under 45% of their curriculum. 
TABLE 1 

Curriculum in Mathematics 

                               Size of                        Standard         % National 
Educational                    national     # Items           deviation of     curriculum 
system                         curriculum   taught            items taught     taught 

U.S.A.                         128           93.6             20.5             73.1 
England                        146           98.7             26.8             67.6 
The Netherlands                127           55.2             16.2             43.5 
Belgium (Flemish)              --            81.0             15.0             85.3 
New Zealand                    148           98.9             21.3             66.8 
Canada (British Columbia)      127           42.7             16.0             33.6 
Canada (Ontario)               118           87.1             16.6             73.8 
Finland                        124           81.5             15.6             65.7 
France                         108           84.6              7.9             78.3 
Hungary                        142           65.9 (86.0)a     35.8 (26.3)a     46.4 (60.0)a 
Israel                         118           70.0 (62.0)b     22.5 (19.1)b     59.3 (52.5)b 
Japan                          146          117.2             10.0             80.3 
Luxemburg                       97           71.7             10.9             73.9 
Sweden                         122           60.1             13.9             49.3 
Thailand                       131          103.2             15.4             78.8 

a Classrooms only in the Budapest area. 

b Classrooms only in the Reformed system (7-9th grade). 



Even though all of the educational systems had 8th grade mathematics as a 
curricular subject, the data in Table 1 indicate that there is variation among these 
educational systems in the content of their mathematics curriculum. Also, the amount 
of instruction varies considerably across educational systems. While the school 
curricula may have become institutionalized at the world level, our data suggest that 
there remains systemic variation in content and instruction.^5 
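
As a consistency check on Table 1, the last column should equal the mean number of items taught divided by the size of the national curriculum. The sketch below recomputes that percentage for the six rows whose names and values are unambiguous in the table (Finland, France, Japan, Luxemburg, Sweden and Thailand); the variable and function names are illustrative only.

```python
# (system, size of national curriculum, mean items taught, reported % taught)
TABLE_1_ROWS = [
    ("Finland", 124, 81.5, 65.7),
    ("France", 108, 84.6, 78.3),
    ("Japan", 146, 117.2, 80.3),
    ("Luxemburg", 97, 71.7, 73.9),
    ("Sweden", 122, 60.1, 49.3),
    ("Thailand", 131, 103.2, 78.8),
]


def pct_taught(size, items_taught):
    """Mean items taught as a percentage of the national curriculum size."""
    return 100.0 * items_taught / size


def discrepancies(rows):
    """Absolute gap between recomputed and reported percentages, per system."""
    return {
        name: abs(pct_taught(size, taught) - reported)
        for name, size, taught, reported in rows
    }


for name, gap in discrepancies(TABLE_1_ROWS).items():
    print(f"{name:10s} gap = {gap:.3f} percentage points")
```

Every recomputed value agrees with the reported percentage to within rounding at one decimal place.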



^5 Our analyses of these data do not indicate a ranking of an educational 
system's overall efficiency in mathematics instruction. We interpret the ranking 
only as an indication of variation in the "size" of and "conformity" to the official 
curriculum. 






Our first hypothesis predicts that teachers in educational systems with state 
control of the curriculum at the national level would be more uniform in their 
implementation of the curriculum in the classroom. The results displayed in Table 2 
indicate that in educational systems in which there is state control of the curriculum at 
the national level there is a modest tendency for more uniformity in the number of 
items that teachers teach. The correlation between an educational system's standard 
deviation in the mean number of items taught and state control is negative and 
significant, but only after we make a minor correction for the Hungarian and Israeli 
samples. There is a stronger association between the minimum number of items taught 
in a classroom in an educational system and our indicator of state control. Educational 
systems with state control of the curriculum at the national level tend to display less 
variation in the amount of instruction and do not have teachers who teach little of the 
curriculum. 

TABLE 2 

Correlations Between Political Incorporation and Implemented Curriculum 

                                                            Correlation with 
Indicator of implemented curriculum                         state control 

Mean number of items in national curriculum taught          -.10 (..7)a 
Standard deviation of mean number of items in 
  national curriculum taught                                -.27 (-.48**)a 
Least number of national curriculum items covered            .46** (.58**)a 
Number of national curriculum items taught by 
  <10% or >90% of teachers                                   .39** (.59*) 
Number of national curriculum items taught by 
  <10% or >90% of teachers                                   .47** (.49**) 
Percentage of national curriculum items taught 
  by >90% of teachers                                        .45** (.45**) 

** p < .05 

a Coefficients in parentheses calculated with partial Israel and Hungary samples. 

We also suggest that teachers in educational systems with state control of the 
curriculum at the national level would be more likely to teach the same material. To 
examine this issue, we constructed three indicators of the similarity among teachers in 
their classroom instruction and correlated these indicators with our measure of state 
control of curriculum. The first two measures are the number of items that 10% or less, 
or 90% or more, of the teachers in an educational system taught. These two measures 
indicate the extent of agreement in instruction among teachers. The first measure 
indicates the extent of agreement in coverage during the 8th grade year and the second 
measure indicates the extent of agreement during both the 7th and 8th grade years. Both 
of these agreement measures are moderately correlated with the level of state control of 
curriculum. 

We constructed a third indicator of agreement that takes into account the 
variation in the size of the mathematics curriculum. We divided the number of items 
that at least 90% of the teachers taught in 7th or 8th grade by the number of items in the 
curriculum. The correlation between this measure and state control is similar in 
strength to the item counts. For each indicator, the analysis suggests that teachers were 
more likely to teach the same material if they taught in educational systems with 
national state control of the curriculum. 

Our second hypothesis predicts that local factors will influence classroom 
instruction in educational systems with state control of curriculum at the local or 
provincial level. To examine this issue we regressed the mean percentage of the 
national curriculum covered in 8th grade on indicators of local factors. The same 
equation was estimated for each sample of teachers and the results are reported in Table 3. 
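
To make the estimation step concrete, the following is a minimal sketch of an ordinary least squares fit with an intercept and its squared multiple correlation. It is not the authors' code: the pure-Python normal-equations solver, the function names, and the two-predictor example (class mastery and periods per week) are assumptions chosen for illustration, whereas the analysis reported here used eight local factors per system.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Move the row with the largest pivot into position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """Return (coefficients with the intercept first, R squared)."""
    design = [[1.0] + list(row) for row in x_rows]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)

    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical classrooms: (class mastery score, periods per week).
local_factors = [(40, 3), (55, 4), (60, 5), (70, 4), (80, 5), (65, 3)]
# Hypothetical proportion of the national curriculum covered in each class.
coverage = [0.52, 0.64, 0.71, 0.70, 0.83, 0.66]

coefficients, r_squared = ols(local_factors, coverage)
print(coefficients, r_squared)
```

Running the same function once per educational system, and then correlating the resulting R squared values with the state control scale, mirrors the system-by-system design described in the Analysis Plan.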

If our description of the effect of state control is correct, we should find that the 
regression equations for educational systems with state control at the local or provincial 
level are significant. All of the educational systems with local curricular control had 
significant equations, while only two with national level control (Finland and Sweden) 
had significant equations. The correlation between the measure of state control of the 
curriculum and the squared multiple correlation coefficients resulting from the 
equations is -.67 (p = .003). 

Among those educational systems with local state control of the curriculum, 
local factors account for from a low of 9% of the variation in instruction in The 
Netherlands to a high of 24% in England and Wales. In these educational systems a 
range of local factors predicted instruction. Teachers in these systems seem to be 
particularly sensitive to student resources within classrooms, both in terms of the 
average level of mathematical mastery of the class and the diversity of ability within the 
class. Following these factors, the amount of instruction depends on time resources, 
both in terms of the number of mathematics sessions and the length of these sessions. 



TABLE 3 

OLS Regression of Local Factors on Implemented Curriculum 

Significant Local Factors (column groups): 
  Student resources:  Range of class; Mastery of class 
  Teacher resources:  Age; Sex; Experience teaching; Experience teaching mathematics 
  Time resources:     Periods per week; Average length of period 

Educational system 
(decentralized above                                      Significant 
the line)              N      r2     F      Intercept     local factors 

U.S.A.                 253    .10    3.1    .61a 
England & Wales        204    .24    6.7    .17 
The Netherlands        206    .09    2.5    -.03 
Belgium                120    .16    2.3    .47 
New Zealand            151    .19    3.8    .27 
-------------------------------------------------------------------------- 
Canada (BC)             73    NS 
Canada (Ontario)       126    NS 
Finland                176    .12    2.7    .45           Mastery of class: .0008 
France                 286    NS 
Hungary                 56    NS 
Israel                  85    NS 
Japan                  193    NS 
Luxemburg               79    NS 
Sweden                 172    .09    2.8    .28           Mastery of class: .0014 
Thailand                80    NS 

[Coefficients printed for the five systems above the line, in the order they 
appear in the scan; their row and column positions cannot be recovered: 
-.041, .031, .0014, .002, (illegible), .0014, .0084, .0075, .0074, .037, (illegible).] 

a The regression coefficients are unstandardized and significant at least at p < .05. 
In Sweden and Finland, the two educational systems with national state control 
of the curriculum and significant equations, the overall level of mathematical mastery 
is the only significant variable. In both countries, there are numerous ability tracks in 
the 8th grade and classes in these tracks, by central administrative definition, should 
receive different amounts of mathematical instruction.^6 

We can explore two possible statistical artifacts within these results. One is that the 
lack of significant regression equations for the educational systems with state control at 
the national level could result from a lack of variation in local factors. Educational 
systems with national control of the curriculum could also be the kind of educational 
systems that equalize between-classroom factors. The between-classroom local factors 
could be so similar that the non-significant equations result from a lack of 
between-classroom variation in local factors. To examine this possibility, we correlated the 
standard deviations of each of the eight indicators of local factors with the measure of 
state control of the curriculum. All of these correlations are small and not significant, 
except one. The exception is that educational systems with national state control of the 
curriculum tend to diminish the between-classroom variation in the number of 
mathematical instruction sessions per week (-.74, p=.001). In general, the 
between-classroom variation in local factors does not vary by the level of state control of the 
curriculum. 

A second consideration is whether there is a spurious correlation between the 
level of state control of the curriculum and our various indicators of instruction. We 
have examined bivariate associations at the system level and there could be other 
system-level factors which might either mediate or negate the correlations that we 
report. 

We examine four such factors: two indicators of the economic development of 
the country (1980 Gross National Product and Gross Domestic Product), and two 
indicators of the size of the educational system (the population in 1980 and the gross 
primary enrollment ratio in 1980). We took the natural logarithm of each of these 
indicators and calculated a partial correlation between the measure of state control and 
the various indicators of instruction controlling for each of these factors. 
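
A first-order partial correlation of this kind can be computed from three pairwise Pearson correlations. The sketch below shows the standard formula; the function names and all of the example values (state control codes, standard deviations of items taught, populations) are hypothetical placeholders rather than the study's data.

```python
from math import log, sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after controlling for a single variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical system-level data (one value per educational system).
state_control = [1, 2, 3, 5, 7, 7, 4, 6]
sd_items_taught = [21.0, 25.5, 16.0, 15.5, 8.0, 10.0, 17.0, 14.0]
population_1980 = [2.3e8, 5.6e7, 1.4e7, 4.8e6, 5.4e7, 1.2e8, 3.1e6, 8.3e6]

# The control variable enters as its natural logarithm, as in the text.
log_population = [log(p) for p in population_1980]

print(pearson(state_control, sd_items_taught))
print(partial_corr(state_control, sd_items_taught, log_population))
```

If the partial correlation stays close to the zero-order correlation, the control variable does little to account for the association, which is the pattern the paper reports.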



^6 At this educational level, Finland has three ability tracks by classroom in 
mathematics (the short course, the long course, and the heterogeneous course) 
and Sweden has two ability groups by classroom (general and advanced). 
Neither the indicators of the level of development nor those of size correlated with the 
level of state control of the curriculum. Though there is a slight tendency for the larger 
educational systems to have local state control of the curriculum, the correlation is not 
statistically significant.^7 In each of the partial correlations, the pattern of correlations 
between state control and instruction did not change after controlling for these other 
variables. 



Discussion 

Educational systems are linked to the state in a variety of ways and this is 
reflected in the degree of political regulation of education. For some educational 
systems there is national state regulation of educational activities through a ministry of 
education, while for other educational systems, educational activities may be 
unregulated or regulated at the local or provincial level. Our results indicate that this 
variation in state regulation of the curriculum is related to the implementation of the 
curriculum in the classroom. In educational systems with strong national control of 
curricular issues, we found that teachers were likely to teach the same material in the 
classroom. If there was local political control of curricular issues, the amount that 
teachers taught was determined by local factors. 

These findings support the value of the political incorporation model as a 
general framework for examining the relationship of the state and education. The state 
can influence far more than the supply of educational opportunities and the chartering 
of schools. As we have shown, qualities of the state also can influence the core technical 
activities of schooling, classroom instruction. This lends additional credibility to studies 
of the state's role in forming the official curriculum. The influences of the state run 
from the creation of the official curriculum to its implementation in the classroom. 

The state's control of curriculum and its implementation may increase 
worldwide. National political incorporation is fueled by internal processes of the state as 

^7 The correlations between level of economic development and curricular 
coverage, and the size of the educational system and curricular coverage were 
generally not statistically significant. There are two implications of these 
findings. First, the lack of associations raises questions about hypotheses which 
suggest that curriculum coverage may be sensitive to economic and technical 
development (for example see Benavot & Kamens, 1989). And secondly, the lack of 
associations suggests that some structural characteristics of educational systems 
may not influence curricular implementation. 
well as external forces. For example, consider the recent national and international 
discussions of the relative effectiveness of nations' educational systems (Lapointe, Mead 
& Phillips, 1989; McKnight, 1987). This debate illustrates the trend to consider student 
achievement as a national resource that should be, therefore, officially monitored by the 
state. These concerns encourage greater national political incorporation. As the 19th 
century state was concerned with expanding school enrollments and attendance, the late 
20th century state is concerned with student achievement and teacher effectiveness. 

Our findings have implications for several other lines of research. Observations 
about the weakness of organizational controls on classroom activities have perhaps 
overlooked the variation in political incorporation and its influence on the technical 
environment of schooling. Our results indicate that the degree of national 
incorporation of education is clearly related to classroom level activities; and 
institutional perspectives need to consider these findings. 

These findings also have implications for the study of the relationship between 
schools and their environments. Results from other studies indicate that organizational 
characteristics of schools are related to characteristics of their environment. For 
example, the degree of administrative complexity in American public schools (Rowan, 
1982) and public school districts (Meyer, Scott & Strang, 1987) is related to the degree of 
fragmentation of the environment and the formal structuring of environmental factors. 
This line of research has not, however, examined the relationship between the 
environment and educational outcomes. Our findings indicate that the complexity of 
the environment, as measured by the degree of national state regulation of the 
curriculum, is related to a significant educational outcome, classroom instruction. 

We have examined only one curricular subject and additional research needs to 
be done on other areas of the curriculum. We predict that the relationship between 
political incorporation and implementation of the curriculum should be stronger for 
other subjects, such as civics and social studies. The content of these subjects is of greater 
interest to the state since they help to shape public definitions of citizenship and civic 
culture. 
The Second International Mathematics Study College Algebra Classroom Process 
Data for Population B were examined: (a) to study reasons cited by teachers for teaching 
subtopics, (b) to study reasons cited for selecting particular content representations, and 
(c) to determine what relationships exist, if any, between teachers who use multiple 
content representations and their teaching decisions, professional opinions, 
backgrounds, classes, and schools. The major results^1 of this analysis are detailed in 
Content Representation in College Algebra: Summary Report. 

Briefly, these results were: (a) External reasons and teacher familiarity frequently 
were cited as reasons for and against teaching particular topics in complex numbers and 
logarithms. Additionally, content reasons frequently are reported as reasons why a topic 
should be taught. Closely paralleling reasons for topic coverage, external reasons and 
teacher familiarity frequently were reported as reasons for and against using a particular 
concept representation and content reasons frequently were reported as reasons why a 
representation should be used. However, only for concept representation, easy to 
understand also was frequently reported as a reason for using a particular 
representation. For both subtopic coverage and concept representation, easy to teach and 
enjoyed by students were not often reported as reasons, either pro or con. 

There were significant relationships between the use of multiple representations and 
teacher development and use of supplemental materials. There also was a relationship 
between multiple representation use and sources of information used to decide what to 
teach, how to teach, and what applications to present. Together these relationships 
suggest that teachers who use multiple representations (MRT) use more sources of 
information (self-developed materials, minimum competency statement, text, or 
syllabus) than do nonmultiple representation teachers (non-MRT). 

Teachers who use multiple representations also allot more time for a topic and they 
are more likely to cover important formulas and theorems more deeply than 
nonmultiple representation teachers. There also was some evidence of a relationship 
between teacher experience/education and multiple representation use, but inferences 
cannot be drawn at this time. 

Purpose of Technical Appendix 

The purpose of the Technical Appendix is to discuss briefly the isolated, statistically 
significant results of this study. Because this research was a first pass through the data 

^1 Major results are results that: (a) were supported by statistically significant relationships 
between [illegible] multiple representation indices and a variable and (b) had additional support 
from a statistically significant relationship between at least one index and another 
closely related variable. 
exploring relationships between multiple representation use and all the other classroom 
process data collected by SIMS, it is to be expected that statistically significant 
relationships will exist due to chance alone. Therefore, these results should not be 
interpreted as a picture of multiple representation use and teacher/school 
characteristics, but rather they should be interpreted as a watercolor sketch of these 
relationships. 
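
The caution about chance findings can be quantified with simple arithmetic. The sketch below assumes that every test is independent and that no true relationship exists; both are simplifying assumptions, and the count of 200 tests is a made-up figure, since the number of tests actually run is not reported here.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of 'significant' results when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one(n_tests, alpha=0.05):
    """Probability of one or more false positives across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_tests = 200  # hypothetical number of index-by-variable tests
print(expected_false_positives(n_tests))
print(prob_at_least_one(n_tests))
```

Under these assumptions, a first pass of 200 tests at the .05 level would be expected to produce about ten isolated significant results by chance, which is why the isolated results in this appendix are treated as tentative.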

Multiple Representation Use and Algebra Classroom Variables 

As the Summary Report noted, teachers who used multiple representations were 
more likely to allot more time for the topic and cover more material more deeply than 
other teachers. Tables 1-3 provide further evidence of this: MRTs covered more topics 
in complex numbers. As might be expected, MRTs also had more reasons for covering 
the material and fewer reasons for NOT covering the material (Tables 4-17). Table 18 
confirms results in the Summary Report by demonstrating that MRTs cover a topic from 
complex numbers more deeply. This is consistent with the previously noted results that 
MRTs spend more time on the material. Tables 19-25 replicate the same results for 
teaching logarithms: MRTs allot more time, cover more material, cover the material 
more deeply, and have more reasons for covering the material. 

Multiple Representation Use and Teacher Variables 

There was a hint in the Summary Report that MRTs were better prepared 
professionally than non-MRTs. Tables 26 and 27 suggest that MRTs call on more 
students in a class period and spend more class time presenting new material than 
non-MRTs. The results in Table 28 appear to be random. Tables 29-41 illustrate statistically 
significant relationships between MRTs and teachers' objectives and sources of 
information. Overall, the only pattern that begins to emerge is that MRTs are more 
likely to see a balanced variety of objectives and use a balanced variety of sources of 
information. This suggests that MRTs take a more reasoned, balanced approach to their 
teaching than non-MRTs. Interestingly, non-MRTs never use many sources of 
information. 

MRTs also are more likely to divide the class into smaller groups (Table 42). 
Non-MRTs were more likely to assign the same homework to all students and they were 
more likely to blame the lack of student progress on the students themselves (Tables 
43-46). The results given in Tables 47-49 do not fit any overall framework other than 
MRTs appear to be more reasonable and less dogmatic in their approaches than 
non-MRTs. 
Multiple Representation Use and School Variables 

As noted in the Summary Report, teachers' use of multiple representations is not 
significantly related to any school variables except the two listed in Tables 50 and 51: 
school days per year and type of overall curriculum. Because of the lack of supporting 
variables, these results are attributed to chance. 

Discussion 

As noted above, these results fill in the picture of multiple representation use 
painted by the Summary Report, although the paints used are watercolors rather than 
oils. The results presented in the Technical Appendix suggest that teachers who use 
multiple representations cover more topics more deeply than teachers who do not use 
multiple representations. As one might expect, MRTs have more reasons for covering 
the material than non-MRTs, and MRTs have fewer reasons for not covering topics. 
Also, these data hint that MRTs are more likely to avail themselves of different 
information sources than non-MRTs and that they are more balanced in their views of 
mathematics, mathematics teaching, and mathematics learning. Of course, confirming 
these suggestions and hints would be an appropriate topic for further research. 



CONTENT REPRESENTATION: TECHNICAL APPENDIX 

TABLE 1 

TABLE OF COMPLEX USED BY ATTCXNN 

COMPLEX USED    ATTCXNN(83.TAUGHT NEW GRAPH COMPLX) 

FREQUENCY 
PERCENT 
ROW PCT 
COL PCT                COVERED   NOT COVERED     TOTAL 

USED <= 1                   14            12        26 
                         12.28         10.53     22.81 
                         53.85         46.15 
                         15.38         52.17 

1 < USED <= 2               26             3        29 
                         22.81          2.63     25.44 
                         89.66         10.34 
                         28.57         13.04 

2 < USED                    51             8        59 
                         44.74          7.02     51.75 
                         86.44         13.56 
                         56.04         34.78 

TOTAL                       91            23       114 
                         79.82         20.18    100.00 

FREQUENCY MISSING = 7 

STATISTICS FOR TABLE OF COMPLEX USED BY ATTCXNN 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2    14.239     0.001 
LIKELIHOOD RATIO CHI-SQUARE        2    12.632     0.002 
MANTEL-HAENSZEL CHI-SQUARE         1     9.266     0.002 
PHI                                      0.353 
CONTINGENCY COEFFICIENT                  0.333 
CRAMER'S V                               0.353 

EFFECTIVE SAMPLE SIZE = 114 
FREQUENCY MISSING = 7 
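
The statistics listed under each table can be recomputed from the cell frequencies alone. The sketch below is offered only as an illustration, assuming the standard textbook definitions of the Pearson and likelihood ratio chi-square statistics and of phi, the contingency coefficient, and Cramer's V. The function name `association_stats` is hypothetical and is not part of the original analysis. Applied to the Table 1 frequencies, it gives values close to those reported.

```python
import math


def association_stats(table):
    """Chi-square statistics and association measures for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    pearson = 0.0
    likelihood_ratio = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            pearson += (observed - expected) ** 2 / expected
            if observed > 0:  # an empty cell contributes nothing to G^2
                likelihood_ratio += 2.0 * observed * math.log(observed / expected)

    df = (len(table) - 1) * (len(table[0]) - 1)
    smaller_dim = min(len(table), len(table[0])) - 1
    return {
        "n": n,
        "df": df,
        "chi_square": pearson,
        "likelihood_ratio": likelihood_ratio,
        # Unsigned phi; for 2 x 2 tables the listings report a signed value.
        "phi": math.sqrt(pearson / n),
        "contingency": math.sqrt(pearson / (pearson + n)),
        "cramers_v": math.sqrt(pearson / (n * smaller_dim)),
    }


# Table 1 frequencies: rows are USED <= 1, 1 < USED <= 2, 2 < USED;
# columns are COVERED, NOT COVERED.
table_1 = [[14, 12], [26, 3], [51, 8]]
stats = association_stats(table_1)
for name, value in stats.items():
    print(f"{name:18s} {value:8.3f}")
```

For tables with two rows or two columns, phi and Cramer's V coincide in absolute value, which is why the two entries match in most of the listings that follow.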
TABLE 2 

TABLE OF COMPLEX USED BY ATTPOLN 

COMPLEX USED    ATTPOLN(88.TAUGHT NEW POLAR COORD COMP NUM) 

FREQUENCY 
PERCENT 
ROW PCT 
COL PCT                COVERED   NOT COVERED     TOTAL 

USED <= 1                   11            15        26 
                          9.91         13.51     23.42 
                         42.31         57.69 
                         14.47         42.86 

1 < USED <= 2               22             7        29 
                         19.82          6.31     26.13 
                         75.86         24.14 
                         28.95         20.00 

2 < USED                    43            13        56 
                         38.74         11.71     50.45 
                         76.79         23.21 
                         56.58         37.14 

TOTAL                       76            35       111 
                         68.47         31.53    100.00 

FREQUENCY MISSING = 10 

STATISTICS FOR TABLE OF COMPLEX USED BY ATTPOLN 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2    10.771     0.005 
LIKELIHOOD RATIO CHI-SQUARE        2    10.202     0.006 
MANTEL-HAENSZEL CHI-SQUARE         1     8.158     0.004 
PHI                                      0.312 
CONTINGENCY COEFFICIENT                  0.297 
CRAMER'S V                               0.312 

EFFECTIVE SAMPLE SIZE = 111 
FREQUENCY MISSING = 10 
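
The Mantel-Haenszel chi-square, which has one degree of freedom in every listing, tests for a linear trend across the ordered categories. A minimal sketch follows, assuming the usual definition (n - 1) r^2, where r is the Pearson correlation between row and column scores, and assuming equally spaced scores 1, 2, 3, ... in table order. The scoring is an assumption, and the function name `mantel_haenszel_chi_square` is hypothetical. With the Table 2 frequencies the result agrees with the reported value to within rounding.

```python
def mantel_haenszel_chi_square(table):
    """(n - 1) * r^2, with rows and columns scored 1, 2, 3, ... in table order."""
    n = sum(sum(row) for row in table)
    sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0.0
    for i, row in enumerate(table, start=1):
        for j, count in enumerate(row, start=1):
            sum_x += count * i
            sum_y += count * j
            sum_xx += count * i * i
            sum_yy += count * j * j
            sum_xy += count * i * j
    # Corrected sums of squares and cross-products.
    s_xx = sum_xx - sum_x ** 2 / n
    s_yy = sum_yy - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    return (n - 1) * r_squared


# Table 2 frequencies: rows are USED <= 1, 1 < USED <= 2, 2 < USED;
# columns are COVERED, NOT COVERED.
table_2 = [[11, 15], [22, 7], [43, 13]]
mh = mantel_haenszel_chi_square(table_2)
print(f"Mantel-Haenszel chi-square = {mh:.3f}")
```

Because a correlation is unchanged by any linear rescaling of the scores, any other equally spaced scoring would give the same statistic.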






TABLE 3 

TABLE OF COMPLEX USED BY ATTDEMN 

COMPLEX USED    ATTDEMN(93.TAUGHT NEW DEMOIVRE'S THRM) 

FREQUENCY 
PERCENT 
ROW PCT 
COL PCT                COVERED   NOT COVERED     TOTAL 

USED <= 1                   12            14        26 
                         10.53         12.28     22.81 
                         46.15         53.85 
                         15.00         41.18 

1 < USED <= 2               24             5        29 
                         21.05          4.39     25.44 
                         82.76         17.24 
                         30.00         14.71 

2 < USED                    44            15        59 
                         38.60         13.16     51.75 
                         74.58         25.42 
                         55.00         44.12 

TOTAL                       80            34       114 
                         70.18         29.82    100.00 

FREQUENCY MISSING = 7 

STATISTICS FOR TABLE OF COMPLEX USED BY ATTDEMN 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     9.908     0.007 
LIKELIHOOD RATIO CHI-SQUARE        2     9.485     0.009 
MANTEL-HAENSZEL CHI-SQUARE         1     4.908     0.027 
PHI                                      0.295 
CONTINGENCY COEFFICIENT                  0.283 
CRAMER'S V                               0.295 

EFFECTIVE SAMPLE SIZE = 114 
FREQUENCY MISSING = 7 
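
Each cell of the listings stacks four numbers: the frequency, the percent of all teachers, the percent of the row, and the percent of the column. As a reading aid, the short sketch below recomputes those four entries from the frequencies of Table 3. The helper name `cell_percentages` is hypothetical and rounding to two decimals is assumed.

```python
def cell_percentages(table):
    """For each cell, return (frequency, percent, row pct, col pct)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        result.append([
            (
                count,
                round(100.0 * count / n, 2),              # PERCENT
                round(100.0 * count / row_totals[i], 2),  # ROW PCT
                round(100.0 * count / col_totals[j], 2),  # COL PCT
            )
            for j, count in enumerate(row)
        ])
    return result


# Table 3 frequencies: rows are USED <= 1, 1 < USED <= 2, 2 < USED;
# columns are COVERED, NOT COVERED.
table_3 = [[12, 14], [24, 5], [44, 15]]
cells = cell_percentages(table_3)
for label, row in zip(["USED <= 1", "1 < USED <= 2", "2 < USED"], cells):
    print(label, row)
```

The row percents are the entries most relevant to the narrative: they give the share of teachers at each level of multiple representation use who covered the topic.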



TABLE 4 

TABLE OF COMPLEX FREQ BY XPOSRTS 

COMPLEX FREQ    XPOSRTS - NUMBER OF POSITIVE REASONS 
                TEACHING COMPLEX ROOTS 

FREQUENCY 
PERCENT 
ROW PCT          <= 2 POSITIVE  3 <= REASONS  5 <= REASONS 
COL PCT                REASONS          <= 4                   TOTAL 

FREQ <= 1                   30            28            19        77 
                         24.79         23.14         15.70     63.64 
                         38.96         36.36         24.68 
                         78.95         62.22         50.00 

FREQ >= 1                    8            17            19        44 
                          6.61         14.05         15.70     36.36 
                         18.18         38.64         43.18 
                         21.05         37.78         50.00 

TOTAL                       38            45            38       121 
                         31.40         37.19         31.40    100.00 

STATISTICS FOR TABLE OF COMPLEX FREQ BY XPOSRTS 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     6.942     0.031 
LIKELIHOOD RATIO CHI-SQUARE        2     7.167     0.028 
MANTEL-HAENSZEL CHI-SQUARE         1     6.880     0.009 
PHI                                      0.240 
CONTINGENCY COEFFICIENT                  0.233 
CRAMER'S V                               0.240 

SAMPLE SIZE = 121 
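
The PROB column gives the upper-tail probability of the chi-square distribution with the listed degrees of freedom. A small sketch of that calculation is given below, assuming the standard closed forms for one degree of freedom and for even degrees of freedom, which cover every DF value appearing in these listings (1, 2, 4 and 6). The function name `chi_square_sf` is hypothetical.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable (df = 1 or even df)."""
    if df == 1:
        # A chi-square with 1 df is the square of a standard normal variable.
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        # For df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        half = x / 2.0
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise ValueError("only df = 1 or even df are handled in this sketch")


# Values taken from the Table 4 statistics.
print(f"{chi_square_sf(6.942, 2):.3f}")  # chi-square, DF = 2
print(f"{chi_square_sf(6.880, 1):.3f}")  # Mantel-Haenszel chi-square, DF = 1
```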



TABLE 5 

TABLE OF COMPLEX FREQ BY XNOTRTS 

COMPLEX FREQ    XNOTRTS - NUMBER OF REASONS NOT MARKED 
                TEACHING COMPLEX ROOTS 

FREQUENCY 
PERCENT 
ROW PCT           <= 4 REASONS  5 <= REASONS  7 <= REASONS 
COL PCT                    NOT          <= 6                   TOTAL 

FREQ <= 1                   19            29            29        77 
                         15.70         23.97         23.97     63.64 
                         24.68         37.66         37.66 
                         50.00         63.04         78.38 

FREQ >= 1                   19            17             8        44 
                         15.70         14.05          6.61     36.36 
                         43.18         38.64         18.18 
                         50.00         36.96         21.62 

TOTAL                       38            46            37       121 
                         31.40         38.02         30.58    100.00 

STATISTICS FOR TABLE OF COMPLEX FREQ BY XNOTRTS 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     6.535     0.038 
LIKELIHOOD RATIO CHI-SQUARE        2     6.711     0.035 
MANTEL-HAENSZEL CHI-SQUARE         1     5.887     0.015 
PHI                                      0.232 
CONTINGENCY COEFFICIENT                  0.226 
CRAMER'S V                               0.232 

SAMPLE SIZE = 121 
TABLE 6 

TABLE OF COMPLEX FREQ BY XPOSCXN 

COMPLEX FREQ    XPOSCXN - NUMBER OF POSITIVE REASONS MARKED FOR 
                TEACHING GRAPHING COMPLEX NUMBERS 

FREQUENCY 
PERCENT 
ROW PCT          <= 1 POSITIVE  2 <= REASONS  4 <= REASONS 
COL PCT                REASONS          <= 3                   TOTAL 

FREQ <= 1                   28            26            23        77 
                         23.14         21.49         19.01     63.64 
                         36.36         33.77         29.87 
                         80.00         65.00         50.00 

FREQ >= 1                    7            14            23        44 
                          5.79         11.57         19.01     36.36 
                         15.91         31.82         52.27 
                         20.00         35.00         50.00 

TOTAL                       35            40            46       121 
                         28.93         33.06         38.02    100.00 

STATISTICS FOR TABLE OF COMPLEX FREQ BY XPOSCXN 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     7.779     0.020 
LIKELIHOOD RATIO CHI-SQUARE        2     8.033     0.018 
MANTEL-HAENSZEL CHI-SQUARE         1     7.714     0.005 
PHI                                      0.254 
CONTINGENCY COEFFICIENT                  0.246 
CRAMER'S V                               0.254 

SAMPLE SIZE = 121 
TABLE 7 

TABLE OF COMPLEX USED BY XPOSCXN 

COMPLEX USED    XPOSCXN - NUMBER OF REASONS MARKED FOR 
                TEACHING GRAPHING COMPLEX NUMBERS 

FREQUENCY 
PERCENT 
ROW PCT          <= 1 POSITIVE  2 <= REASONS  4 <= REASONS 
COL PCT                REASONS          <= 3                   TOTAL 

USED <= 1                   19             9             5        33 
                         15.70          7.44          4.13     27.27 
                         57.58         27.27         15.15 
                         54.29         22.50         10.87 

1 < USED <= 2                4            14            11        29 
                          3.31         11.57          9.09     23.97 
                         13.79         48.28         37.93 
                         11.43         35.00         23.91 

2 < USED                    12            17            30        59 
                          9.92         14.05         24.79     48.76 
                         20.34         28.81         50.85 
                         34.29         42.50         65.22 

TOTAL                       35            40            46       121 
                         28.93         33.06         38.02    100.00 

STATISTICS FOR TABLE OF COMPLEX USED BY XPOSCXN 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         4    22.945     0.000 
LIKELIHOOD RATIO CHI-SQUARE        4    22.449     0.000 
MANTEL-HAENSZEL CHI-SQUARE         1    15.246     0.000 
PHI                                      0.435 
CONTINGENCY COEFFICIENT                  0.399 
CRAMER'S V                               0.308 

SAMPLE SIZE = 121 
TABLE 8 

TABLE OF COMPLEX FREQ BY XNOTCXN 

COMPLEX FREQ    XNOTCXN - NUMBER OF REASONS NOT MARKED FOR 
                TEACHING GRAPHING COMPLEX NUMBERS 

FREQUENCY 
PERCENT 
ROW PCT           <= 5 REASONS  6 <= REASONS  8 <= REASONS 
COL PCT                    NOT          <= 7                   TOTAL 

FREQ <= 1                   23            35            19        77 
                         19.01         28.93         15.70     63.64 
                         29.87         45.45         24.68 
                         50.00         71.43         73.08 

FREQ >= 1                   23            14             7        44 
                         19.01         11.57          5.79     36.36 
                         52.27         31.82         15.91 
                         50.00         28.57         26.92 

TOTAL                       46            49            26       121 
                         38.02         40.50         21.49    100.00 

STATISTICS FOR TABLE OF COMPLEX FREQ BY XNOTCXN 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     5.984     0.050 
LIKELIHOOD RATIO CHI-SQUARE        2     5.937     0.051 
MANTEL-HAENSZEL CHI-SQUARE         1     5.766     0.016 
PHI                                      0.222 
CONTINGENCY COEFFICIENT                  0.217 
CRAMER'S V                               0.222 

SAMPLE SIZE = 121 



TABLE 9 

TABLE OF COMPLEX USED BY XNOTCXN 

COMPLEX USED    XNOTCXN - NUMBER OF REASONS NOT MARKED FOR 

FREQUENCY 
PERCENT 
ROW PCT           <= 5 REASONS  6 <= REASONS  8 <= REASONS 
COL PCT                    NOT          <= 7                   TOTAL 

USED <= 1                    5            16            12        33 
                          4.13         13.22          9.92     27.27 
                         15.15         48.48         36.36 
                         10.87         32.65         46.15 

1 < USED <= 2               11            15             3        29 
                          9.09         12.40          2.48     23.97 
                         37.93         51.72         10.34 
                         23.91         30.61         11.54 

2 < USED                    30            18            11        59 
                         24.79         14.88          9.09     48.76 
                         50.85         30.51         18.64 
                         65.22         36.73         42.31 

TOTAL                       46            49            26       121 
                         38.02         40.50         21.49    100.00 

STATISTICS FOR TABLE OF COMPLEX USED BY XNOTCXN 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         4    15.266     0.004 
LIKELIHOOD RATIO CHI-SQUARE        4    16.226     0.003 
MANTEL-HAENSZEL CHI-SQUARE         1    11.136     0.001 
PHI                                      0.355 
CONTINGENCY COEFFICIENT                  0.335 
CRAMER'S V                               0.251 

SAMPLE SIZE = 121 



TABLE 10 

TABLE OF COMPLEX USED BY XPOSPOL 

COMPLEX USED    XPOSPOL - NUMBER OF POSITIVE REASONS MARKED FOR 
                TEACHING COMPLEX NUMBERS ON POLAR COORDS 

FREQUENCY 
PERCENT 
ROW PCT          <= 1 POSITIVE  2 <= REASONS  4 <= REASONS 
COL PCT                REASONS          <= 3                   TOTAL 

USED <= 1                   22             6             5        33 
                         18.18          4.96          4.13     27.27 
                         66.67         18.18         15.15 
                         44.90         17.65         13.16 

1 < USED <= 2                8            11            10        29 
                          6.61          9.09          8.26     23.97 
                         27.59         37.93         34.48 
                         16.33         32.35         26.32 

2 < USED                    19            17            23        59 
                         15.70         14.05         19.01     48.76 
                         32.20         28.81         38.98 
                         38.78         50.00         60.53 

TOTAL                       49            34            38       121 
                         40.50         28.10         31.40    100.00 

STATISTICS FOR TABLE OF COMPLEX USED BY XPOSPOL 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         4    13.882     0.008 
LIKELIHOOD RATIO CHI-SQUARE        4    13.840     0.008 
MANTEL-HAENSZEL CHI-SQUARE         1     8.814     0.003 
PHI                                      0.339 
CONTINGENCY COEFFICIENT                  0.321 
CRAMER'S V                               0.240 

SAMPLE SIZE = 121 



TABLE 11 

TABLE OF COMPLEX USED BY XNEGPOL 

COMPLEX USED    XNEGPOL - NUMBER OF NEGATIVE REASONS MARKED FOR NOT 
                TEACHING COMPLEX NUMBERS ON POLAR COORDS 

FREQUENCY 
PERCENT 
ROW PCT             0 NEGATIVE  0 < NEGATIVE 
COL PCT                REASONS       REASONS     TOTAL 

USED <= 1                   19            14        33 
                         15.70         11.57     27.27 
                         57.58         42.42 
                         20.00         53.85 

1 < USED <= 2               27             2        29 
                         22.31          1.65     23.97 
                         93.10          6.90 
                         28.42          7.69 

2 < USED                    49            10        59 
                         40.50          8.26     48.76 
                         83.05         16.95 
                         51.58         38.46 

TOTAL                       95            26       121 
                         78.51         21.49    100.00 

STATISTICS FOR TABLE OF COMPLEX USED BY XNEGPOL 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2    12.954     0.002 
LIKELIHOOD RATIO CHI-SQUARE        2    12.682     0.002 
MANTEL-HAENSZEL CHI-SQUARE         1     6.252     0.012 
PHI                                      0.327 
CONTINGENCY COEFFICIENT                  0.311 
CRAMER'S V                               0.327 

SAMPLE SIZE = 121 



TABLE 12 

TABLE OF COMPLEX USED BY XNEGDEM 

COMPLEX USED    XNEGDEM - NUMBER OF NEGATIVE REASONS MARKED FOR NOT 
                TEACHING DEMOIVRE'S THEOREM 

FREQUENCY 
PERCENT 
ROW PCT             0 NEGATIVE  0 < NEGATIVE 
COL PCT                REASONS       REASONS     TOTAL 

USED <= 1                   19            14        33 
                         15.70         11.57     27.27 
                         57.58         42.42 
                         20.88         46.67 

1 < USED <= 2               26             3        29 
                         21.49          2.48     23.97 
                         89.66         10.34 
                         28.57         10.00 

2 < USED                    46            13        59 
                         38.02         10.74     48.76 
                         77.97         22.03 
                         50.55         43.33 

TOTAL                       91            30       121 
                         75.21         24.79    100.00 

STATISTICS FOR TABLE OF COMPLEX USED BY XNEGDEM 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     8.989     0.011 
LIKELIHOOD RATIO CHI-SQUARE        2     9.030     0.011 
MANTEL-HAENSZEL CHI-SQUARE         1     3.413     0.065 
PHI                                      0.273 
CONTINGENCY COEFFICIENT                  0.263 
CRAMER'S V                               0.273 

SAMPLE SIZE = 121 



TABLE 13 

TABLE OF COMPLEX USED BY XNOTDEM 

COMPLEX USED    XNOTDEM - NUMBER OF REASONS NOT MARKED FOR 
                TEACHING DEMOIVRE'S THEOREM 

FREQUENCY 
PERCENT 
ROW PCT           <= 5 REASONS  6 <= REASONS  8 <= REASONS 
COL PCT                    NOT          <= 7                   TOTAL 

USED <= 1                    4            15            14        33 
                          3.31         12.40         11.57     27.27 
                         12.12         45.45         42.42 
                         10.00         31.25         42.42 

1 < USED <= 2               11            14             4        29 
                          9.09         11.57          3.31     23.97 
                         37.93         48.28         13.79 
                         27.50         29.17         12.12 

2 < USED                    25            19            15        59 
                         20.66         15.70         12.40     48.76 
                         42.37         32.20         25.42 
                         62.50         39.58         45.45 

TOTAL                       40            48            33       121 
                         33.06         39.67         27.27    100.00 

STATISTICS FOR TABLE OF COMPLEX USED BY XNOTDEM 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         4    12.565     0.014 
LIKELIHOOD RATIO CHI-SQUARE        4    13.881     0.008 
MANTEL-HAENSZEL CHI-SQUARE         1     7.881     0.005 
PHI                                      0.322 
CONTINGENCY COEFFICIENT                  0.307 
CRAMER'S V                               0.228 

SAMPLE SIZE = 121 
TABLE 14 

TABLE OF COMPLEX FREQ BY XTOTPOS 

COMPLEX FREQ    XTOTPOS - TOTAL NUMBER OF POSITIVE REASONS MARKED 

FREQUENCY 
PERCENT 
ROW PCT          <= 7 POSITIVE  8 <= REASONS 14 <= REASONS 
COL PCT                REASONS         <= 14                   TOTAL 

FREQ <= 1                   33            22            22        77 
                         27.27         18.18         18.18     63.64 
                         42.86         28.57         28.57 
                         80.49         56.41         53.66 

FREQ >= 1                    8            17            19        44 
                          6.61         14.05         15.70     36.36 
                         18.18         38.64         43.18 
                         19.51         43.59         46.34 

TOTAL                       41            39            41       121 
                         33.88         32.23         33.88    100.00 

STATISTICS FOR TABLE OF COMPLEX FREQ BY XTOTPOS 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     7.675     0.022 
LIKELIHOOD RATIO CHI-SQUARE        2     8.113     0.017 
MANTEL-HAENSZEL CHI-SQUARE         1     6.533     0.011 
PHI                                      0.252 
CONTINGENCY COEFFICIENT                  0.244 
CRAMER'S V                               0.252 

SAMPLE SIZE = 121 
TABLE 15 

TABLE OF COMPLEX USED BY XTOTPOS 

COMPLEX USED    XTOTPOS - TOTAL NUMBER OF POSITIVE REASONS MARKED 

FREQUENCY 
PERCENT 
ROW PCT          <= 7 POSITIVE  8 <= REASONS 14 <= REASONS 
COL PCT                REASONS         <= 14                   TOTAL 

USED <= 1                   22             6             5        33 
                         18.18          4.96          4.13     27.27 
                         66.67         18.18         15.15 
                         53.66         15.38         12.20 

1 < USED <= 2                5            13            11        29 
                          4.13         10.74          9.09     23.97 
                         17.24         44.83         37.93 
                         12.20         33.33         26.83 

2 < USED                    14            20            25        59 
                         11.57         16.53         20.66     48.76 
                         23.73         33.90         42.37 
                         34.15         51.28         60.98 

TOTAL                       41            39            41       121 
                         33.88         32.23         33.88    100.00 

STATISTICS FOR TABLE OF COMPLEX USED BY XTOTPOS 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         4    22.945     0.000 
LIKELIHOOD RATIO CHI-SQUARE        4    22.380     0.000 
MANTEL-HAENSZEL CHI-SQUARE         1    13.615     0.000 
PHI                                      0.435 
CONTINGENCY COEFFICIENT                  0.399 
CRAMER'S V                               0.308 

SAMPLE SIZE = 121 
TABLE 16 

TABLE OF COMPLEX USED BY XTOTNEG 

COMPLEX USED    XTOTNEG - TOTAL NUMBER OF NEGATIVE REASONS MARKED 

FREQUENCY 
PERCENT 
ROW PCT             0 NEGATIVE  0 < NEGATIVE 
COL PCT                REASONS       REASONS     TOTAL 

USED <= 1                   14            19        33 
                         11.57         15.70     27.27 
                         42.42         57.58 
                         16.87         50.00 

1 < USED <= 2               26             3        29 
                         21.49          2.48     23.97 
                         89.66         10.34 
                         31.33          7.89 

2 < USED                    43            16        59 
                         35.54         13.22     48.76 
                         72.88         27.12 
                         51.81         42.11 

TOTAL                       83            38       121 
                         68.60         31.40    100.00 

STATISTICS FOR TABLE OF COMPLEX USED BY XTOTNEG 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2    16.966     0.000 
LIKELIHOOD RATIO CHI-SQUARE        2    17.356     0.000 
MANTEL-HAENSZEL CHI-SQUARE         1     6.641     0.010 
PHI                                      0.374 
CONTINGENCY COEFFICIENT                  0.351 
CRAMER'S V                               0.374 

SAMPLE SIZE = 121 




TABLE 17 

TABLE OF COMPLEX USED BY XTOTNOT 

COMPLEX USED    XTOTNOT - TOTAL NUMBER OF REASONS NOT MARKED 

FREQUENCY 
PERCENT 
ROW PCT          <= 20 REASONS 21 <= REASONS 29 <= REASONS 
COL PCT                    NOT         <= 28                   TOTAL 

USED <= 1                    5            15            13        33 
                          4.13         12.40         10.74     27.27 
                         15.15         45.45         39.39 
                         12.82         26.32         52.00 

1 < USED <= 2               10            16             3        29 
                          8.26         13.22          2.48     23.97 
                         34.48         55.17         10.34 
                         25.64         28.07         12.00 

2 < USED                    24            26             9        59 
                         19.83         21.49          7.44     48.76 
                         40.68         44.07         15.25 
                         61.54         45.61         36.00 

TOTAL                       39            57            25       121 
                         32.23         47.11         20.66    100.00 

STATISTICS FOR TABLE OF COMPLEX USED BY XTOTNOT 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         4    12.807     0.012 
LIKELIHOOD RATIO CHI-SQUARE        4    12.658     0.013 
MANTEL-HAENSZEL CHI-SQUARE         1     7.888     0.005 
PHI                                      0.325 
CONTINGENCY COEFFICIENT                  0.309 
CRAMER'S V                               0.230 

SAMPLE SIZE = 121 



TABLE 18 

TABLE OF COMPLEX USED BY APRCOS 

COMPLEX USED    APRCOS(217.PRESENTED R COSINE THETA) 

FREQUENCY 
PERCENT 
ROW PCT            GAVE FORMAL      STATED W     STATED NO       DID NOT 
COL PCT                  PROOF         DERIV         DERIV         COVER     TOTAL 

USED <= 1                    2             6             5            16        29 
                          1.71          5.13          4.27         13.68     24.79 
                          6.90         20.69         17.24         55.17 
                          4.35         27.27         31.25         48.48 

1 < USED <= 2               15             5             4             5        29 
                         12.82          4.27          3.42          4.27     24.79 
                         51.72         17.24         13.79         17.24 
                         32.61         22.73         25.00         15.15 

2 < USED                    29            11             7            12        59 
                         24.79          9.40          5.98         10.26     50.43 
                         49.15         18.64         11.86         20.34 
                         63.04         50.00         43.75         36.36 

TOTAL                       46            22            16            33       117 
                         39.32         18.80         13.68         28.21    100.00 

FREQUENCY MISSING = 4 



STATISTICS FOR TABLE OF COMPLEX USED BY APRCOS 



STATISTIC 



DF VALUE PROB 



CHJ -SQUARE . 
LIKELIHOOD RATIO CHI-SQUARE 6 l^'tnl 
HANTEL^HAENSZEL CHI-sK' ? ?J:SSf g-g^l 

0.299 

EFFECTIVE SAMPLE SIZE = 117 
FREQUENCY MISSING = 4 
TABLE 19 

TABLE OF LOG FREQ BY ATTGRLN 

LOG FREQ        ATTGRLN(43.TAUGHT NEW GRAPHING LOG FUNCT) 

FREQUENCY 
PERCENT 
ROW PCT 
COL PCT                COVERED   NOT COVERED     TOTAL 

FREQ <= 1                   59            24        83 
                         51.30         20.87     72.17 
                         71.08         28.92 
                         67.05         88.89 

FREQ >= 1                   29             3        32 
                         25.22          2.61     27.83 
                         90.63          9.38 
                         32.95         11.11 

TOTAL                       88            27       115 
                         76.52         23.48    100.00 

FREQUENCY MISSING = 1 

STATISTICS FOR TABLE OF LOG FREQ BY ATTGRLN 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         1     4.909     0.027 
LIKELIHOOD RATIO CHI-SQUARE        1     5.604     0.018 
CONTINUITY ADJ. CHI-SQUARE         1     3.881     0.049 
MANTEL-HAENSZEL CHI-SQUARE         1     4.866     0.027 
FISHER'S EXACT TEST (1-TAIL)                         0.020 
                    (2-TAIL)                         0.028 
PHI                                     -0.207 
CONTINGENCY COEFFICIENT                  0.202 
CRAMER'S V                              -0.207 

EFFECTIVE SAMPLE SIZE = 115 
FREQUENCY MISSING = 1 
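
For 2 x 2 tables such as Table 19, the listings add a continuity-adjusted chi-square and report phi with a sign. The following sketch recomputes those quantities from the four cell frequencies, assuming the usual shortcut formulas for a 2 x 2 table and Yates' form of the continuity correction. The function name `two_by_two_stats` is hypothetical. Fisher's exact test is not included in this sketch.

```python
import math


def two_by_two_stats(a, b, c, d):
    """Chi-square statistics and signed phi for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    margin_product = row1 * row2 * col1 * col2
    cross = a * d - b * c
    chi_square = n * cross ** 2 / margin_product
    # Yates' correction shrinks |ad - bc| by n/2 (never below zero).
    adjusted = n * max(abs(cross) - n / 2.0, 0.0) ** 2 / margin_product
    return {
        "chi_square": chi_square,
        "continuity_adj": adjusted,
        "mantel_haenszel": (n - 1) * chi_square / n,
        "phi": cross / math.sqrt(margin_product),
    }


# Table 19 frequencies: FREQ <= 1 -> 59 covered, 24 not covered;
# FREQ >= 1 -> 29 covered, 3 not covered.
stats_19 = two_by_two_stats(59, 24, 29, 3)
for name, value in stats_19.items():
    print(f"{name:16s} {value:7.3f}")
```

The negative phi reflects the ordering of the columns: teachers in the higher frequency row are less often found in the second (NOT COVERED) column.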





TABLE 20 

TABLE OF LOG USED BY ATTGRLN 

LOG USED        ATTGRLN(43.TAUGHT NEW GRAPHING LOG FUNCT) 

FREQUENCY 
PERCENT 
ROW PCT 
COL PCT                COVERED   NOT COVERED     TOTAL 

USED <= 1                   20            14        34 
                         17.39         12.17     29.57 
                         58.82         41.18 
                         22.73         51.85 

1 < USED <= 2               44            11        55 
                         38.26          9.57     47.83 
                         80.00         20.00 
                         50.00         40.74 

2 < USED                    24             2        26 
                         20.87          1.74     22.61 
                         92.31          7.69 
                         27.27          7.41 

TOTAL                       88            27       115 
                         76.52         23.48    100.00 

FREQUENCY MISSING = 1 

STATISTICS FOR TABLE OF LOG USED BY ATTGRLN 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     9.904     0.007 
LIKELIHOOD RATIO CHI-SQUARE        2    10.132     0.006 
MANTEL-HAENSZEL CHI-SQUARE         1     9.510     0.002 
PHI                                      0.293 
CONTINGENCY COEFFICIENT                  0.282 
CRAMER'S V                               0.293 

EFFECTIVE SAMPLE SIZE = 115 
FREQUENCY MISSING = 1 
TABLE 21 

TABLE OF LOG FREQ BY XPOSGRL 

LOG FREQ        XPOSGRL - NUMBER OF POSITIVE REASONS MARKED FOR 
                TEACHING GRAPHING LOG FUNCTIONS 

FREQUENCY 
PERCENT 
ROW PCT          <= 1 POSITIVE  2 <= REASONS  4 <= REASONS 
COL PCT                REASONS          <= 3                   TOTAL 

FREQ <= 1                   28            24            32        84 
                         24.14         20.69         27.59     72.41 
                         33.33         28.57         38.10 
                         87.50         60.00         72.73 

FREQ >= 1                    4            16            12        32 
                          3.45         13.79         10.34     27.59 
                         12.50         50.00         37.50 
                         12.50         40.00         27.27 

TOTAL                       32            40            44       116 
                         27.59         34.48         37.93    100.00 

STATISTICS FOR TABLE OF LOG FREQ BY XPOSGRL 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     6.734     0.034 
LIKELIHOOD RATIO CHI-SQUARE        2     7.131     0.028 
MANTEL-HAENSZEL CHI-SQUARE         1     1.460     0.227 
PHI                                      0.241 
CONTINGENCY COEFFICIENT                  0.234 
CRAMER'S V                               0.241 

SAMPLE SIZE = 116 
TABLE 22 

TABLE OF LOG FREQ BY XNEGGRL 

LOG FREQ        XNEGGRL - NUMBER OF NEGATIVE REASONS MARKED FOR NOT 
                TEACHING GRAPHING LOG FUNCTIONS 

FREQUENCY 
PERCENT 
ROW PCT             0 NEGATIVE  0 < NEGATIVE 
COL PCT                REASONS       REASONS     TOTAL 

FREQ <= 1                   61            23        84 
                         52.59         19.83     72.41 
                         72.62         27.38 
                         66.30         95.83 

FREQ >= 1                   31             1        32 
                         26.72          0.86     27.59 
                         96.88          3.13 
                         33.70          4.17 

TOTAL                       92            24       116 
                         79.31         20.69    100.00 

STATISTICS FOR TABLE OF LOG FREQ BY XNEGGRL 

STATISTIC np \yAi iir 

. — "'^ VALUE PROB 

CK I -SQUARE T 



LIKELIHOOD RATIO CHI -SQUARE 1 lo'lsl S'HS^ 

CONTINUITY ADJ. CHI-SQUARE 1 Mil 2-°°^ 
MANTEL-HAENS2EL CH -SQUARE 

FISHER'S EXACT TEST (1-TAIL) ^'^^^ g-ggj 

PHI (2-TAIL) oioOU 

^C^U^nV COEFFICIENT -gili? 



CRAMER'S V 
SAMPLE SIZE = 116 



•0.268 



TABLE 23 

TABLE OF LOG USED BY XNEGGRL 

LOG USED        XNEGGRL - TOTAL NUMBER OF REASONS MARKED FOR NOT 
                TEACHING GRAPHING LOG FUNCTIONS 

FREQUENCY 
PERCENT 
ROW PCT             0 NEGATIVE  0 < NEGATIVE 
COL PCT                REASONS       REASONS     TOTAL 

USED <= 1                   22            12        34 
                         18.97         10.34     29.31 
                         64.71         35.29 
                         23.91         50.00 

1 < USED <= 2               45            11        56 
                         38.79          9.48     48.28 
                         80.36         19.64 
                         48.91         45.83 

2 < USED                    25             1        26 
                         21.55          0.86     22.41 
                         96.15          3.85 
                         27.17          4.17 

TOTAL                       92            24       116 
                         79.31         20.69    100.00 

STATISTICS FOR TABLE OF LOG USED BY XNEGGRL 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     8.952     0.011 
LIKELIHOOD RATIO CHI-SQUARE        2    10.165     0.006 
MANTEL-HAENSZEL CHI-SQUARE         1     8.875     0.003 
PHI                                      0.278 
CONTINGENCY COEFFICIENT                  0.268 
CRAMER'S V                               0.278 
TABLE 24 

TABLE OF LOG USED BY XTOTNEG 

LOG USED        XTOTNEG - TOTAL NUMBER OF NEGATIVE REASONS MARKED 

FREQUENCY 
PERCENT 
ROW PCT             0 NEGATIVE  0 < NEGATIVE 
COL PCT                REASONS       REASONS     TOTAL 

USED <= 1                   18            16        34 
                         15.52         13.79     29.31 
                         52.94         47.06 
                         25.00         36.36 

1 < USED <= 2               31            25        56 
                         26.72         21.55     48.28 
                         55.36         44.64 
                         43.06         56.82 

2 < USED                    23             3        26 
                         19.83          2.59     22.41 
                         88.46         11.54 
                         31.94          6.82 

TOTAL                       72            44       116 
                         62.07         37.93    100.00 

STATISTICS FOR TABLE OF LOG USED BY XTOTNEG 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         2     9.967     0.007 
LIKELIHOOD RATIO CHI-SQUARE        2    11.383     0.003 
MANTEL-HAENSZEL CHI-SQUARE         1     7.034     0.008 
PHI                                      0.293 
CONTINGENCY COEFFICIENT                  0.281 
CRAMER'S V                               0.293 

SAMPLE SIZE = 116 




TABLE 25 

TABLE OF LOG FREQ BY AEXLOGB 

LOG FREQ        AEXLOGB(203.EXPECT LOG BASE B OF X) 

FREQUENCY 
PERCENT 
ROW PCT              PROVE AND    DERIVE AND    RECALL AND   WHEN GIVEN,           NOT 
COL PCT                  APPLY         APPLY         APPLY         APPLY     DISCUSSED     TOTAL 

FREQ <= 1                    2            15            34             8            16        75 
                          1.89         14.15         32.08          7.55         15.09     70.75 
                          2.67         20.00         45.33         10.67         21.33 
                         25.00         78.95         69.39         66.67         88.89 

FREQ >= 1                    6             4            15             4             2        31 
                          5.66          3.77         14.15          3.77          1.89     29.25 
                         19.35         12.90         48.39         12.90          6.45 
                         75.00         21.05         30.61         33.33         11.11 

TOTAL                        8            19            49            12            18       106 
                          7.55         17.92         46.23         11.32         16.98    100.00 

FREQUENCY MISSING = 10 

STATISTICS FOR TABLE OF LOG FREQ BY AEXLOGB 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         4    11.712     0.020 
LIKELIHOOD RATIO CHI-SQUARE        4    11.366     0.023 
MANTEL-HAENSZEL CHI-SQUARE         1     4.998     0.025 
PHI                                      0.332 
CONTINGENCY COEFFICIENT                  0.315 
CRAMER'S V                               0.332 

EFFECTIVE SAMPLE SIZE = 106 
FREQUENCY MISSING = 10 
TABLE 26 

TABLE OF COMPLEX USED BY TQUEST 

COMPLEX USED    TQUEST - DIFFERENT STUDENTS QUESTIONED 

FREQUENCY 
PERCENT 
ROW PCT               N <= 25%       25% < N       50% < N       N > 75% 
COL PCT                    (1)    <= 50% (2)    <= 75% (3)           (4)     TOTAL 

USED <= 1                   14             5            10             3        32 
                         12.28          4.39          8.77          2.63     28.07 
                         43.75         15.63         31.25          9.38 
                         45.16         16.67         40.00         10.71 

1 < USED <= 2                6             4             8             6        24 
                          5.26          3.51          7.02          5.26     21.05 
                         25.00         16.67         33.33         25.00 
                         19.35         13.33         32.00         21.43 

2 < USED                    11            21             7            19        58 
                          9.65         18.42          6.14         16.67     50.88 
                         18.97         36.21         12.07         32.76 
                         35.48         70.00         28.00         67.86 

TOTAL                       31            30            25            28       114 
                         27.19         26.32         21.93         24.56    100.00 

FREQUENCY MISSING = 7 

STATISTICS FOR TABLE OF COMPLEX USED BY TQUEST 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         6    18.963     0.004 
LIKELIHOOD RATIO CHI-SQUARE        6    19.712     0.003 
MANTEL-HAENSZEL CHI-SQUARE         1     3.904     0.048 
PHI                                      0.408 
CONTINGENCY COEFFICIENT                  0.378 
CRAMER'S V                               0.288 

EFFECTIVE SAMPLE SIZE = 114 
FREQUENCY MISSING = 7 



TABLE 27 

TABLE OF COMPLEX USED BY TEXPLNT 

COMPLEX USED    TEXPLNT - MINUTES EXPLAINING NEW MATERIAL 
                TYPICAL WEEK 

FREQUENCY 
PERCENT 
ROW PCT             MINS < 100   100 <= MINS   150 <= MINS 
COL PCT                                < 150                   TOTAL 

USED <= 1                   11            16             5        32 
                          9.65         14.04          4.39     28.07 
                         34.38         50.00         15.63 
                         33.33         39.02         12.50 

1 < USED <= 2                9             6             9        24 
                          7.89          5.26          7.89     21.05 
                         37.50         25.00         37.50 
                         27.27         14.63         22.50 

2 < USED                    13            19            26        58 
                         11.40         16.67         22.81     50.88 
                         22.41         32.76         44.83 
                         39.39         46.34         65.00 

TOTAL                       33            41            40       114 
                         28.95         35.96         35.09    100.00 

FREQUENCY MISSING = 7 

STATISTICS FOR TABLE OF COMPLEX USED BY TEXPLNT 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         4     9.571     0.048 
LIKELIHOOD RATIO CHI-SQUARE        4    10.266     0.036 
MANTEL-HAENSZEL CHI-SQUARE         1     4.690     0.030 
PHI                                      0.290 
CONTINGENCY COEFFICIENT                  0.278 
CRAMER'S V                               0.205 

EFFECTIVE SAMPLE SIZE = 114 
FREQUENCY MISSING = 7 
SAMPLE SIZE = 116 
TABLE 28 

TABLE OF LOG FREQ BY TNONEW 

LOG FREQ        TNONEW - NOT ANY DISCOVERIES FOR A LONG TIME 

FREQUENCY 
PERCENT 
ROW PCT                 STRONG 
COL PCT           DISAGREE (1)  DISAGREE (2) UNDECIDED (3)     TOTAL 

FREQ <= 1                   25            45             7        77 
                         23.15         41.67          6.48     71.30 
                         32.47         58.44          9.09 
                         89.29         63.38         77.78 

FREQ >= 1                    3            26             2        31 
                          2.78         24.07          1.85     28.70 
                          9.68         83.87          6.45 
                         10.71         36.62         22.22 

TOTAL                       28            71             9       108 
                         25.93         65.74          8.33    100.00 

FREQUENCY MISSING = 8 

STATISTICS FOR TABLE OF LOG FREQ BY TNONEW 

DF VALUE PROB 

CH I -SQUARE 2 ^^tat*^' 

LIKELIHOOD RATIO CHI-SQUARE I 7 6o5 

MANTEL-HAENSZEL CHI-SQUARE 1 lltll 

CONTINGENCY COEFFICIENT n'lu] 

CRAMER'S V 0,251 



0,03U 
0,022 
0,091 



EFFECTIVE SAMPLE SIZE = 108 
FREQUENCY MISSING = 8 






TABLE 29 

TABLE OF LOG USED BY ROBJINT 

LOG USED        ROBJINT(3.OBJECTIVE..INTEREST IN MATHEMATICS) 

FREQUENCY 
PERCENT 
ROW PCT             RELATIVELY         EQUAL    RELATIVELY 
COL PCT                   MORE      EMPHASIS          LESS     TOTAL 

USED <= 1                   12            16             4        32 
                         10.53         14.04          3.51     28.07 
                         37.50         50.00         12.50 
                         31.58         27.59         22.22 

1 < USED <= 2               15            35             6        56 
                         13.16         30.70          5.26     49.12 
                         26.79         62.50         10.71 
                         39.47         60.34         33.33 

2 < USED                    11             7             8        26 
                          9.65          6.14          7.02     22.81 
                         42.31         26.92         30.77 
                         28.95         12.07         44.44 

TOTAL                       38            58            18       114 
                         33.33         50.88         15.79    100.00 

FREQUENCY MISSING = 2 

STATISTICS FOR TABLE OF LOG USED BY ROBJINT 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         4    10.767     0.029 
LIKELIHOOD RATIO CHI-SQUARE        4    10.600     0.031 
MANTEL-HAENSZEL CHI-SQUARE         1     0.582     0.446 
PHI                                      0.307 
CONTINGENCY COEFFICIENT                  0.294 
CRAMER'S V                               0.217 

EFFECTIVE SAMPLE SIZE = 114 
FREQUENCY MISSING = 2 
TABLE 30 

TABLE OF LOG USED BY ROBJLIF 

LOG USED        ROBJLIF(6.OBJECTIVE..AWARENESS OF MATH IN LIFE) 

FREQUENCY 
PERCENT 
ROW PCT             RELATIVELY         EQUAL    RELATIVELY 
COL PCT                   MORE      EMPHASIS          LESS     TOTAL 

USED <= 1                   16            10             6        32 
                         14.04          8.77          5.26     28.07 
                         50.00         31.25         18.75 
                         48.48         16.67         28.57 

1 < USED <= 2               12            33            11        56 
                         10.53         28.95          9.65     49.12 
                         21.43         58.93         19.64 
                         36.36         55.00         52.38 

2 < USED                     5            17             4        26 
                          4.39         14.91          3.51     22.81 
                         19.23         65.38         15.38 
                         15.15         28.33         19.05 

TOTAL                       33            60            21       114 
                         28.95         52.63         18.42    100.00 

FREQUENCY MISSING = 2 

STATISTICS FOR TABLE OF LOG USED BY ROBJLIF 

STATISTIC                         DF     VALUE      PROB 

CHI-SQUARE                         4    11.023     0.026 
LIKELIHOOD RATIO CHI-SQUARE        4    10.776     0.029 
MANTEL-HAENSZEL CHI-SQUARE         1     2.601     0.107 
PHI                                      0.311 
CONTINGENCY COEFFICIENT                  0.297 
CRAMER'S V                               0.220 

EFFECTIVE SAMPLE SIZE = 114 
FREQUENCY MISSING = 2 
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CONTENT REPRESENTATION: TECHNICAL APPENDIX 
TABLE 31

TABLE OF LOG USED BY ROBJCOM

LOG USED        ROBJCOM(7. OBJECTIVE..COMPUTATION SPEED ACCURACY)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |RELATIVE|EQUAL EM|RELATIVE|
               |LY MORE |PHASIS  |LY LESS |   TOTAL
---------------+--------+--------+--------+
USED <= 1      |     13 |     12 |      7 |      32
               |  11.40 |  10.53 |   6.14 |   28.07
               |  40.63 |  37.50 |  21.88 |
               |  44.83 |  20.34 |  26.92 |
---------------+--------+--------+--------+
1 < USED <= 2  |     14 |     28 |     14 |      56
               |  12.28 |  24.56 |  12.28 |   49.12
               |  25.00 |  50.00 |  25.00 |
               |  48.28 |  47.46 |  53.85 |
---------------+--------+--------+--------+
2 < USED       |      2 |     19 |      5 |      26
               |   1.75 |  16.67 |   4.39 |   22.81
               |   7.69 |  73.08 |  19.23 |
               |   6.90 |  32.20 |  19.23 |
---------------+--------+--------+--------+
TOTAL                29       59       26       114
                  25.44    51.75    22.81    100.00

FREQUENCY MISSING = 2

STATISTICS FOR TABLE OF LOG USED BY ROBJCOM

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      4     9.974    0.041
LIKELIHOOD RATIO CHI-SQUARE     4    10.628    0.031
MANTEL-HAENSZEL CHI-SQUARE      1     2.789    0.095
PHI                                   0.296
CONTINGENCY COEFFICIENT               0.284
CRAMER'S V                            0.209

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 2



TABLE 32

TABLE OF COMPLEX USED BY ROBJSCI

COMPLEX USED    ROBJSCI(8. OBJECTIVE..AWARE OF MATH IN SCIENCE)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |RELATIVE|EQUAL EM|RELATIVE|
               |LY MORE |PHASIS  |LY LESS |   TOTAL
---------------+--------+--------+--------+
USED <= 1      |     11 |     19 |      2 |      32
               |   9.17 |  15.83 |   1.67 |   26.67
               |  34.38 |  59.38 |   6.25 |
               |  28.95 |  30.65 |  10.00 |
---------------+--------+--------+--------+
1 < USED <= 2  |      8 |     11 |     10 |      29
               |   6.67 |   9.17 |   8.33 |   24.17
               |  27.59 |  37.93 |  34.48 |
               |  21.05 |  17.74 |  50.00 |
---------------+--------+--------+--------+
2 < USED       |     19 |     32 |      8 |      59
               |  15.83 |  26.67 |   6.67 |   49.17
               |  32.20 |  54.24 |  13.56 |
               |  50.00 |  51.61 |  40.00 |
---------------+--------+--------+--------+
TOTAL                38       62       20       120
                  31.67    51.67    16.67    100.00

FREQUENCY MISSING = 1

STATISTICS FOR TABLE OF COMPLEX USED BY ROBJSCI

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      4     9.683    0.046
LIKELIHOOD RATIO CHI-SQUARE     4     9.146    0.058
MANTEL-HAENSZEL CHI-SQUARE      1     0.106    0.744
PHI                                   0.284
CONTINGENCY COEFFICIENT               0.273
CRAMER'S V                            0.201

EFFECTIVE SAMPLE SIZE = 120
FREQUENCY MISSING = 1
TABLE 33

TABLE OF LOG FREQ BY ROBJSCI

LOG FREQ        ROBJSCI(8. OBJECTIVE..AWARE OF MATH IN SCIENCE)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |RELATIVE|EQUAL EM|RELATIVE|
               |LY MORE |PHASIS  |LY LESS |   TOTAL
---------------+--------+--------+--------+
FREQ <= 1      |     17 |     52 |     13 |      82
               |  14.91 |  45.61 |  11.40 |   71.93
               |  20.73 |  63.41 |  15.85 |
               |  48.57 |  82.54 |  81.25 |
---------------+--------+--------+--------+
FREQ >= 1      |     18 |     11 |      3 |      32
               |  15.79 |   9.65 |   2.63 |   28.07
               |  56.25 |  34.38 |   9.38 |
               |  51.43 |  17.46 |  18.75 |
---------------+--------+--------+--------+
TOTAL                35       63       16       114
                  30.70    55.26    14.04    100.00

FREQUENCY MISSING = 2

STATISTICS FOR TABLE OF LOG FREQ BY ROBJSCI

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2    13.659    0.001
LIKELIHOOD RATIO CHI-SQUARE     2    13.056    0.001
MANTEL-HAENSZEL CHI-SQUARE      1     9.591    0.002
PHI                                   0.346
CONTINGENCY COEFFICIENT               0.327
CRAMER'S V                            0.346

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 2



TABLE 34

TABLE OF LOG FREQ BY RSISYLG

LOG FREQ        RSISYLG(10B. GOALS SOURCE..SYLLABUS)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |NEVER US|OCCASION|FREQUENT|
               |ED      |ALLY    |LY USED |   TOTAL
---------------+--------+--------+--------+
FREQ <= 1      |     27 |     37 |     11 |      75
               |  25.23 |  34.58 |  10.28 |   70.09
               |  36.00 |  49.33 |  14.67 |
               |  90.00 |  61.67 |  64.71 |
---------------+--------+--------+--------+
FREQ >= 1      |      3 |     23 |      6 |      32
               |   2.80 |  21.50 |   5.61 |   29.91
               |   9.38 |  71.88 |  18.75 |
               |  10.00 |  38.33 |  35.29 |
---------------+--------+--------+--------+
TOTAL                30       60       17       107
                  28.04    56.07    15.89    100.00

FREQUENCY MISSING = 9

STATISTICS FOR TABLE OF LOG FREQ BY RSISYLG

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2     7.939    0.019
LIKELIHOOD RATIO CHI-SQUARE     2     9.095    0.011
MANTEL-HAENSZEL CHI-SQUARE      1     4.936    0.026
PHI                                   0.272
CONTINGENCY COEFFICIENT               0.263
CRAMER'S V                            0.272

EFFECTIVE SAMPLE SIZE = 107
FREQUENCY MISSING = 9



TABLE 35

TABLE OF LOG USED BY RSIPROG

LOG USED        RSIPROG(10P. GOALS SOURCE..PROF MEETINGS)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |NEVER US|OCCASION|FREQUENT|
               |ED      |ALLY    |LY USED |   TOTAL
---------------+--------+--------+--------+
USED <= 1      |      4 |     12 |     14 |      30
               |   3.77 |  11.32 |  13.21 |   28.30
               |  13.33 |  40.00 |  46.67 |
               |  19.05 |  20.69 |  51.85 |
---------------+--------+--------+--------+
1 < USED <= 2  |     11 |     35 |      7 |      53
               |  10.38 |  33.02 |   6.60 |   50.00
               |  20.75 |  66.04 |  13.21 |
               |  52.38 |  60.34 |  25.93 |
---------------+--------+--------+--------+
2 < USED       |      6 |     11 |      6 |      23
               |   5.66 |  10.38 |   5.66 |   21.70
               |  26.09 |  47.83 |  26.09 |
               |  28.57 |  18.97 |  22.22 |
---------------+--------+--------+--------+
TOTAL                21       58       27       106
                  19.81    54.72    25.47    100.00

FREQUENCY MISSING = 10

STATISTICS FOR TABLE OF LOG USED BY RSIPROG

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      4    12.169    0.016
LIKELIHOOD RATIO CHI-SQUARE     4    11.885    0.018
MANTEL-HAENSZEL CHI-SQUARE      1     3.868    0.049
PHI                                   0.339
CONTINGENCY COEFFICIENT               0.321
CRAMER'S V                            0.240

EFFECTIVE SAMPLE SIZE = 106
FREQUENCY MISSING = 10

TABLE 36

TABLE OF LOG FREQ BY RSIJRNP

LOG FREQ        RSIJRNP(11E. PRESENTATION SOURCE..JOURNALS, BOOKS)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |NEVER US|OCCASION|FREQUENT|
               |ED      |ALLY    |LY USED |   TOTAL
---------------+--------+--------+--------+
FREQ <= 1      |     50 |     17 |      9 |      76
               |  46.73 |  15.89 |   8.41 |   71.03
               |  65.79 |  22.37 |  11.84 |
               |  83.33 |  58.62 |  50.00 |
---------------+--------+--------+--------+
FREQ >= 1      |     10 |     12 |      9 |      31
               |   9.35 |  11.21 |   8.41 |   28.97
               |  32.26 |  38.71 |  29.03 |
               |  16.67 |  41.38 |  50.00 |
---------------+--------+--------+--------+
TOTAL                60       29       18       107
                  56.07    27.10    16.82    100.00

FREQUENCY MISSING = 9

STATISTICS FOR TABLE OF LOG FREQ BY RSIJRNP

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2    10.452    0.005
LIKELIHOOD RATIO CHI-SQUARE     2    10.450    0.005
MANTEL-HAENSZEL CHI-SQUARE      1     9.761    0.002
PHI                                   0.313
CONTINGENCY COEFFICIENT               0.298
CRAMER'S V                            0.313

EFFECTIVE SAMPLE SIZE = 107
FREQUENCY MISSING = 9
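The association statistics listed under each cross-tabulation follow directly from the cell counts. As a minimal sketch (not the SAS procedure that produced this appendix), the Python below recomputes the Pearson and likelihood-ratio chi-square, phi, the contingency coefficient, and Cramer's V from the Table 36 counts (50, 17, 9 in the first row; 10, 12, 9 in the second). The function name `crosstab_stats` and the dictionary keys are illustrative assumptions, and phi is taken here as sqrt(chi-square / n).

```python
import math


def crosstab_stats(table):
    """Chi-square statistics and association measures for an r x c table.

    `table` is a list of rows of observed counts. The name and return
    format are hypothetical, chosen only for this illustration.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    pearson = 0.0
    likelihood_ratio = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            pearson += (observed - expected) ** 2 / expected
            if observed > 0:  # empty cells contribute nothing to G^2
                likelihood_ratio += 2.0 * observed * math.log(observed / expected)

    smaller_dim = min(len(table), len(table[0]))
    return {
        "df": (len(table) - 1) * (len(table[0]) - 1),
        "pearson": pearson,
        "likelihood_ratio": likelihood_ratio,
        "phi": math.sqrt(pearson / n),
        "contingency": math.sqrt(pearson / (pearson + n)),
        "cramers_v": math.sqrt(pearson / (n * (smaller_dim - 1))),
    }


# Observed counts for LOG FREQ (rows) by RSIJRNP (columns), Table 36.
table_36 = [[50, 17, 9], [10, 12, 9]]
stats_36 = crosstab_stats(table_36)
for name, value in stats_36.items():
    print(f"{name:18s} {value:.3f}")
```

For a table with only two rows, Cramer's V reduces to phi, which is why the two values coincide in the two-row listings of this appendix.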



TABLE 37

TABLE OF LOG FREQ BY RSIOTHP

LOG FREQ        RSIOTHP(11C. PRESENTATION SOURCE..OTHER TEACHERS)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |NEVER US|OCCASION|FREQUENT|
               |ED      |ALLY    |LY USED |   TOTAL
---------------+--------+--------+--------+
FREQ <= 1      |     49 |     22 |      4 |      75
               |  46.23 |  20.75 |   3.77 |   70.75
               |  65.33 |  29.33 |   5.33 |
               |  79.03 |  66.67 |  36.36 |
---------------+--------+--------+--------+
FREQ >= 1      |     13 |     11 |      7 |      31
               |  12.26 |  10.38 |   6.60 |   29.25
               |  41.94 |  35.48 |  22.58 |
               |  20.97 |  33.33 |  63.64 |
---------------+--------+--------+--------+
TOTAL                62       33       11       106
                  58.49    31.13    10.38    100.00

FREQUENCY MISSING = 10

STATISTICS FOR TABLE OF LOG FREQ BY RSIOTHP

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2     8.607    0.014
LIKELIHOOD RATIO CHI-SQUARE     2     8.011    0.018
MANTEL-HAENSZEL CHI-SQUARE      1     7.851    0.005
PHI                                   0.285
CONTINGENCY COEFFICIENT               0.274
CRAMER'S V                            0.285

EFFECTIVE SAMPLE SIZE = 106
FREQUENCY MISSING = 10

TABLE 38

TABLE OF COMPLEX USED BY RSITXTD

COMPLEX USED    RSITXTD(12A. DRILL SOURCE..TEXTBOOK)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |NEVER US|OCCASION|FREQUENT|
               |ED      |ALLY    |LY USED |   TOTAL
---------------+--------+--------+--------+
USED <= 1      |     23 |      3 |      4 |      30
               |  20.18 |   2.63 |   3.51 |   26.32
               |  76.67 |  10.00 |  13.33 |
               |  38.98 |   8.82 |  19.05 |
---------------+--------+--------+--------+
1 < USED <= 2  |     10 |      7 |     10 |      27
               |   8.77 |   6.14 |   8.77 |   23.68
               |  37.04 |  25.93 |  37.04 |
               |  16.95 |  20.59 |  47.62 |
---------------+--------+--------+--------+
2 < USED       |     26 |     24 |      7 |      57
               |  22.81 |  21.05 |   6.14 |   50.00
               |  45.61 |  42.11 |  12.28 |
               |  44.07 |  70.59 |  33.33 |
---------------+--------+--------+--------+
TOTAL                59       34       21       114
                  51.75    29.82    18.42    100.00

FREQUENCY MISSING = 7

STATISTICS FOR TABLE OF COMPLEX USED BY RSITXTD

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      4    18.784    0.001
LIKELIHOOD RATIO CHI-SQUARE     4    18.558    0.001
MANTEL-HAENSZEL CHI-SQUARE      1     1.686    0.194
PHI                                   0.406
CONTINGENCY COEFFICIENT               0.376
CRAMER'S V                            0.287

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 7



TABLE 39

TABLE OF COMPLEX FREQ BY RSISYLA

COMPLEX FREQ    RSISYLA(13B. APPLICATIONS SOURCE..SYLLABUS)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |NEVER US|OCCASION|FREQUENT|
               |ED      |ALLY    |LY USED |   TOTAL
---------------+--------+--------+--------+
FREQ <= 1      |     29 |     34 |      9 |      72
               |  25.44 |  29.82 |   7.89 |   63.16
               |  40.28 |  47.22 |  12.50 |
               |  80.56 |  59.65 |  42.86 |
---------------+--------+--------+--------+
FREQ >= 1      |      7 |     23 |     12 |      42
               |   6.14 |  20.18 |  10.53 |   36.84
               |  16.67 |  54.76 |  28.57 |
               |  19.44 |  40.35 |  57.14 |
---------------+--------+--------+--------+
TOTAL                36       57       21       114
                  31.58    50.00    18.42    100.00

FREQUENCY MISSING = 7

STATISTICS FOR TABLE OF COMPLEX FREQ BY RSISYLA

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2     8.704    0.013
LIKELIHOOD RATIO CHI-SQUARE     2     9.017    0.011
MANTEL-HAENSZEL CHI-SQUARE      1     8.578    0.003
PHI                                   0.276
CONTINGENCY COEFFICIENT               0.266
CRAMER'S V                            0.276

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 7

TABLE 40

TABLE OF COMPLEX USED BY RSISYLA

COMPLEX USED    RSISYLA(13B. APPLICATIONS SOURCE..SYLLABUS)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |NEVER US|OCCASION|FREQUENT|
               |ED      |ALLY    |LY USED |   TOTAL
---------------+--------+--------+--------+
USED <= 1      |     15 |     11 |      4 |      30
               |  13.16 |   9.65 |   3.51 |   26.32
               |  50.00 |  36.67 |  13.33 |
               |  41.67 |  19.30 |  19.05 |
---------------+--------+--------+--------+
1 < USED <= 2  |     10 |     14 |      3 |      27
               |   8.77 |  12.28 |   2.63 |   23.68
               |  37.04 |  51.85 |  11.11 |
               |  27.78 |  24.56 |  14.29 |
---------------+--------+--------+--------+
2 < USED       |     11 |     32 |     14 |      57
               |   9.65 |  28.07 |  12.28 |   50.00
               |  19.30 |  56.14 |  24.56 |
               |  30.56 |  56.14 |  66.67 |
---------------+--------+--------+--------+
TOTAL                36       57       21       114
                  31.58    50.00    18.42    100.00

FREQUENCY MISSING = 7

STATISTICS FOR TABLE OF COMPLEX USED BY RSISYLA

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      4    10.087    0.039
LIKELIHOOD RATIO CHI-SQUARE     4    10.184    0.037
MANTEL-HAENSZEL CHI-SQUARE      1     7.849    0.005
PHI                                   0.297
CONTINGENCY COEFFICIENT               0.285
CRAMER'S V                            0.210

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 7
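The Mantel-Haenszel chi-square in these listings has one degree of freedom whatever the table size, because it tests only for a linear trend between the ordered row and column categories. A minimal sketch follows, assuming equally spaced integer scores 1, 2, 3, ... for both rows and columns (an assumption made for illustration; the scores actually used in the original analysis are not stated here). The statistic is (n - 1) times the squared Pearson correlation between the scores; the function name is hypothetical.

```python
def mantel_haenszel_chi_square(table):
    """(n - 1) * r^2 for an r x c table, using integer scores 1..R and 1..C."""
    n = sum(sum(row) for row in table)
    sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0.0
    for i, row in enumerate(table, start=1):       # i is the row score
        for j, count in enumerate(row, start=1):   # j is the column score
            sum_x += count * i
            sum_y += count * j
            sum_xx += count * i * i
            sum_yy += count * j * j
            sum_xy += count * i * j
    # Corrected sums of squares and cross-products of the scores.
    s_xx = sum_xx - sum_x ** 2 / n
    s_yy = sum_yy - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    return (n - 1) * r_squared


# Observed counts for COMPLEX USED (rows) by RSISYLA (columns), Table 40.
table_40 = [[15, 11, 4], [10, 14, 3], [11, 32, 14]]
mh_40 = mantel_haenszel_chi_square(table_40)
print(f"Mantel-Haenszel chi-square: {mh_40:.3f}")
```

Because the statistic responds only to the linear component of association, it can be small even when the overall chi-square is large, as happens in several tables of this appendix where the pattern across categories is not monotone.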



TABLE 41

TABLE OF LOG FREQ BY RSITXTA

LOG FREQ        RSITXTA(13A. APPLICATIONS SOURCE..TEXTBOOK)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |NEVER US|OCCASION|FREQUENT|
               |ED      |ALLY    |LY USED |   TOTAL
---------------+--------+--------+--------+
FREQ <= 1      |     36 |     29 |     10 |      75
               |  33.64 |  27.10 |   9.35 |   70.09
               |  48.00 |  38.67 |  13.33 |
               |  81.82 |  56.86 |  83.33 |
---------------+--------+--------+--------+
FREQ >= 1      |      8 |     22 |      2 |      32
               |   7.48 |  20.56 |   1.87 |   29.91
               |  25.00 |  68.75 |   6.25 |
               |  18.18 |  43.14 |  16.67 |
---------------+--------+--------+--------+
TOTAL                44       51       12       107
                  41.12    47.66    11.21    100.00

FREQUENCY MISSING = 9

STATISTICS FOR TABLE OF LOG FREQ BY RSITXTA

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2     8.143    0.017
LIKELIHOOD RATIO CHI-SQUARE     2     8.280    0.016
MANTEL-HAENSZEL CHI-SQUARE      1     1.297    0.255
PHI                                   0.276
CONTINGENCY COEFFICIENT               0.266
CRAMER'S V                            0.276

EFFECTIVE SAMPLE SIZE = 107
FREQUENCY MISSING = 9

TABLE 42

TABLE OF LOG USED BY RGRPWHL

LOG USED        RGRPWHL(26. WHOLE CLASS WORKING AS A SINGLE GROUP)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |% < 60  |60 <= % |75 <= % |
               |        |< 75    |        |   TOTAL
---------------+--------+--------+--------+
USED <= 1      |      6 |      4 |     22 |      32
               |   5.26 |   3.51 |  19.30 |   28.07
               |  18.75 |  12.50 |  68.75 |
               |  20.00 |  12.50 |  42.31 |
---------------+--------+--------+--------+
1 < USED <= 2  |     17 |     16 |     23 |      56
               |  14.91 |  14.04 |  20.18 |   49.12
               |  30.36 |  28.57 |  41.07 |
               |  56.67 |  50.00 |  44.23 |
---------------+--------+--------+--------+
2 < USED       |      7 |     12 |      7 |      26
               |   6.14 |  10.53 |   6.14 |   22.81
               |  26.92 |  46.15 |  26.92 |
               |  23.33 |  37.50 |  13.46 |
---------------+--------+--------+--------+
TOTAL                30       32       52       114
                  26.32    28.07    45.61    100.00

FREQUENCY MISSING = 2

STATISTICS FOR TABLE OF LOG USED BY RGRPWHL

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      4    12.845    0.012
LIKELIHOOD RATIO CHI-SQUARE     4    12.983    0.011
MANTEL-HAENSZEL CHI-SQUARE      1     2.032    0.154
PHI                                   0.336
CONTINGENCY COEFFICIENT               0.318
CRAMER'S V                            0.237

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 2

TABLE 43

TABLE OF LOG FREQ BY RHDNAPP

LOG FREQ        RHDNAPP(42. SOME STUDENTS..NOT APPLICABLE)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |YES     |NO      |   TOTAL
---------------+--------+--------+
FREQ <= 1      |     57 |     25 |      82
               |  50.00 |  21.93 |   71.93
               |  69.51 |  30.49 |
               |  66.28 |  89.29 |
---------------+--------+--------+
FREQ >= 1      |     29 |      3 |      32
               |  25.44 |   2.63 |   28.07
               |  90.63 |   9.38 |
               |  33.72 |  10.71 |
---------------+--------+--------+
TOTAL                86       28       114
                  75.44    24.56    100.00

FREQUENCY MISSING = 2

STATISTICS FOR TABLE OF LOG FREQ BY RHDNAPP

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      1     5.537    0.019
LIKELIHOOD RATIO CHI-SQUARE     1     6.339    0.012
CONTINUITY ADJ. CHI-SQUARE      1     4.457    0.035
MANTEL-HAENSZEL CHI-SQUARE      1     5.489    0.019
FISHER'S EXACT TEST (1-TAIL)                   0.014
                    (2-TAIL)                   0.028
PHI                                  -0.220
CONTINGENCY COEFFICIENT               0.215
CRAMER'S V                           -0.220

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 2
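For a 2 x 2 table such as Table 43, the Pearson, continuity-adjusted, and Mantel-Haenszel chi-squares and the signed phi coefficient all have closed forms in the four cell counts. The sketch below is illustrative only (the function name and return order are assumptions, and Fisher's exact test is deliberately left out); it uses the Table 43 counts 57, 25 in the first row and 29, 3 in the second.

```python
import math


def two_by_two_statistics(a, b, c, d):
    """Return (pearson, continuity_adjusted, mantel_haenszel, phi) for
    the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Product of the two row totals and the two column totals.
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    cross = a * d - b * c
    pearson = n * cross ** 2 / margins
    # Yates correction: shrink |ad - bc| by n/2 before squaring.
    continuity_adjusted = n * max(abs(cross) - n / 2.0, 0.0) ** 2 / margins
    mantel_haenszel = (n - 1) * cross ** 2 / margins
    phi = cross / math.sqrt(margins)  # the sign shows the direction of association
    return pearson, continuity_adjusted, mantel_haenszel, phi


pearson_43, adjusted_43, mh_43, phi_43 = two_by_two_statistics(57, 25, 29, 3)
print(f"chi-square          {pearson_43:.3f}")
print(f"continuity adjusted {adjusted_43:.3f}")
print(f"Mantel-Haenszel     {mh_43:.3f}")
print(f"phi                 {phi_43:.3f}")
```

In the 2 x 2 case the Mantel-Haenszel value is simply the Pearson chi-square multiplied by (n - 1)/n, and the negative phi reflects that the second row has proportionally fewer cases in the second column.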
TABLE 44

TABLE OF LOG FREQ BY RPGINDF

LOG FREQ        RPGINDF(45. PROGRESS..STUDENT INDIFFERENCE)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |IMPORTAN|SOMEWHAT|NOT IMPO|
               |T REASON| IMPORTA|RTANT   |   TOTAL
---------------+--------+--------+--------+
FREQ <= 1      |     44 |     35 |      3 |      82
               |  38.60 |  30.70 |   2.63 |   71.93
               |  53.66 |  42.68 |   3.66 |
               |  77.19 |  72.92 |  33.33 |
---------------+--------+--------+--------+
FREQ >= 1      |     13 |     13 |      6 |      32
               |  11.40 |  11.40 |   5.26 |   28.07
               |  40.63 |  40.63 |  18.75 |
               |  22.81 |  27.08 |  66.67 |
---------------+--------+--------+--------+
TOTAL                57       48        9       114
                  50.00    42.11     7.89    100.00

FREQUENCY MISSING = 2

STATISTICS FOR TABLE OF LOG FREQ BY RPGINDF

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2     7.445    0.024
LIKELIHOOD RATIO CHI-SQUARE     2     6.604    0.037
MANTEL-HAENSZEL CHI-SQUARE      1     4.493    0.034
PHI                                   0.256
CONTINGENCY COEFFICIENT               0.248
CRAMER'S V                            0.256

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 2



TABLE 45

TABLE OF COMPLEX FREQ BY RPGABS

COMPLEX FREQ    RPGABS(47. PROGRESS..STUDENT ABSENTEEISM)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |IMPORTAN|SOMEWHAT|NOT IMPO|
               |T REASON| IMPORTA|RTANT   |   TOTAL
---------------+--------+--------+--------+
FREQ <= 1      |     29 |     29 |     18 |      76
               |  24.17 |  24.17 |  15.00 |   63.33
               |  38.16 |  38.16 |  23.68 |
               |  61.70 |  78.38 |  50.00 |
---------------+--------+--------+--------+
FREQ >= 1      |     18 |      8 |     18 |      44
               |  15.00 |   6.67 |  15.00 |   36.67
               |  40.91 |  18.18 |  40.91 |
               |  38.30 |  21.62 |  50.00 |
---------------+--------+--------+--------+
TOTAL                47       37       36       120
                  39.17    30.83    30.00    100.00

FREQUENCY MISSING = 1

STATISTICS FOR TABLE OF COMPLEX FREQ BY RPGABS

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2     6.416    0.040
LIKELIHOOD RATIO CHI-SQUARE     2     6.620    0.037
MANTEL-HAENSZEL CHI-SQUARE      1     0.847    0.357
PHI                                   0.231
CONTINGENCY COEFFICIENT               0.225
CRAMER'S V                            0.231

EFFECTIVE SAMPLE SIZE = 120
FREQUENCY MISSING = 1

TABLE 46

TABLE OF LOG USED BY XRTCNATT

LOG USED        XRTCNATT - % STUDENTS NOT ATTENTIVE AND
                NOT BEHAVIORAL PROBLEMS

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |NO STUDE|5<=     |
               |NTS     |        |   TOTAL
---------------+--------+--------+
USED <= 1      |      6 |     26 |      32
               |   5.36 |  23.21 |   28.57
               |  18.75 |  81.25 |
               |  14.29 |  37.14 |
---------------+--------+--------+
1 < USED <= 2  |     20 |     34 |      54
               |  17.86 |  30.36 |   48.21
               |  37.04 |  62.96 |
               |  47.62 |  48.57 |
---------------+--------+--------+
2 < USED       |     16 |     10 |      26
               |  14.29 |   8.93 |   23.21
               |  61.54 |  38.46 |
               |  38.10 |  14.29 |
---------------+--------+--------+
TOTAL                42       70       112
                  37.50    62.50    100.00

FREQUENCY MISSING = 4

STATISTICS FOR TABLE OF LOG USED BY XRTCNATT

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2    11.215    0.004
LIKELIHOOD RATIO CHI-SQUARE     2    11.470    0.003
MANTEL-HAENSZEL CHI-SQUARE      1    11.001    0.001
PHI                                   0.316
CONTINGENCY COEFFICIENT               0.302
CRAMER'S V                            0.316

EFFECTIVE SAMPLE SIZE = 112
FREQUENCY MISSING = 4

TABLE 47

TABLE OF LOG FREQ BY RECHNG

LOG FREQ        RECHNG(66. RATING..CHANGE ACTIVITY IF NO ATTN)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |OF LITTL|SOME IMP|MAJOR IM|AMONG HI|
               |E OR NO |ORTANCE |PORTANCE|GHEST   |   TOTAL
---------------+--------+--------+--------+--------+
FREQ <= 1      |      9 |     27 |     33 |     13 |      82
               |   7.89 |  23.68 |  28.95 |  11.40 |   71.93
               |  10.98 |  32.93 |  40.24 |  15.85 |
               |  45.00 |  75.00 |  78.57 |  81.25 |
---------------+--------+--------+--------+--------+
FREQ >= 1      |     11 |      9 |      9 |      3 |      32
               |   9.65 |   7.89 |   7.89 |   2.63 |   28.07
               |  34.38 |  28.13 |  28.13 |   9.38 |
               |  55.00 |  25.00 |  21.43 |  18.75 |
---------------+--------+--------+--------+--------+
TOTAL                20       36       42       16       114
                  17.54    31.58    36.84    14.04    100.00

FREQUENCY MISSING = 2

STATISTICS FOR TABLE OF LOG FREQ BY RECHNG

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      3     8.958    0.030
LIKELIHOOD RATIO CHI-SQUARE     3     8.243    0.041
MANTEL-HAENSZEL CHI-SQUARE      1     6.086    0.014
PHI                                   0.280
CONTINGENCY COEFFICIENT               0.270
CRAMER'S V                            0.280

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 2



TABLE 48

TABLE OF LOG FREQ BY REFEED

LOG FREQ        REFEED(74. RATING..FREQUENT INDIVIDUAL FEEDBACK)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |SOME IMP|MAJOR IM|AMONG HI|
               |ORTANCE |PORTANCE|GHEST   |   TOTAL
---------------+--------+--------+--------+
FREQ <= 1      |      9 |     35 |     38 |      82
               |   7.89 |  30.70 |  33.33 |   71.93
               |  10.98 |  42.68 |  46.34 |
               |  69.23 |  62.50 |  84.44 |
---------------+--------+--------+--------+
FREQ >= 1      |      4 |     21 |      7 |      32
               |   3.51 |  18.42 |   6.14 |   28.07
               |  12.50 |  65.63 |  21.88 |
               |  30.77 |  37.50 |  15.56 |
---------------+--------+--------+--------+
TOTAL                13       56       45       114
                  11.40    49.12    39.47    100.00

FREQUENCY MISSING = 2

STATISTICS FOR TABLE OF LOG FREQ BY REFEED

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2     6.004    0.050
LIKELIHOOD RATIO CHI-SQUARE     2     6.300    0.043
MANTEL-HAENSZEL CHI-SQUARE      1     3.584    0.058
PHI                                   0.229
CONTINGENCY COEFFICIENT               0.224
CRAMER'S V                            0.229

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 2

TABLE 49

TABLE OF COMPLEX FREQ BY RESAYGD

COMPLEX FREQ    RESAYGD(89. RATING..SAY SOMETHING GOOD ABOUT ANS)

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |OF LITTL|SOME IMP|MAJOR IM|AMONG HI|
               |E OR NO |ORTANCE |PORTANCE|GHEST   |   TOTAL
---------------+--------+--------+--------+--------+
FREQ <= 1      |      7 |     26 |     35 |      7 |      75
               |   5.93 |  22.03 |  29.66 |   5.93 |   63.56
               |   9.33 |  34.67 |  46.67 |   9.33 |
               |  87.50 |  57.78 |  72.92 |  41.18 |
---------------+--------+--------+--------+--------+
FREQ >= 1      |      1 |     19 |     13 |     10 |      43
               |   0.85 |  16.10 |  11.02 |   8.47 |   36.44
               |   2.33 |  44.19 |  30.23 |  23.26 |
               |  12.50 |  42.22 |  27.08 |  58.82 |
---------------+--------+--------+--------+--------+
TOTAL                 8       45       48       17       118
                   6.78    38.14    40.68    14.41    100.00

FREQUENCY MISSING = 3

STATISTICS FOR TABLE OF COMPLEX FREQ BY RESAYGD

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      3     8.121    0.044
LIKELIHOOD RATIO CHI-SQUARE     3     8.370    0.039
MANTEL-HAENSZEL CHI-SQUARE      1     1.398    0.237
PHI                                   0.262
CONTINGENCY COEFFICIENT               0.254
CRAMER'S V                            0.262

EFFECTIVE SAMPLE SIZE = 118
FREQUENCY MISSING = 3

TABLE 50

TABLE OF COMPLEX FREQ BY SDAYSYR

COMPLEX FREQ    SDAYSYR

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |< 180   |=180    |180<    |   TOTAL
---------------+--------+--------+--------+
FREQ <= 1      |     12 |     42 |     17 |      71
               |  10.53 |  36.84 |  14.91 |   62.28
               |  16.90 |  59.15 |  23.94 |
               |  41.38 |  68.85 |  70.83 |
---------------+--------+--------+--------+
FREQ >= 1      |     17 |     19 |      7 |      43
               |  14.91 |  16.67 |   6.14 |   37.72
               |  39.53 |  44.19 |  16.28 |
               |  58.62 |  31.15 |  29.17 |
---------------+--------+--------+--------+
TOTAL                29       61       24       114
                  25.44    53.51    21.05    100.00

FREQUENCY MISSING = 7

STATISTICS FOR TABLE OF COMPLEX FREQ BY SDAYSYR

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      2     7.262    0.026
LIKELIHOOD RATIO CHI-SQUARE     2     7.105    0.029
MANTEL-HAENSZEL CHI-SQUARE      1     7.190    0.007
PHI                                   0.252
CONTINGENCY COEFFICIENT               0.245
CRAMER'S V                            0.252

EFFECTIVE SAMPLE SIZE = 114
FREQUENCY MISSING = 7



TABLE 51

TABLE OF COMPLEX USED BY SN015

COMPLEX USED    SN015 - OVERALL CURRICULUM

FREQUENCY      |
PERCENT        |
ROW PCT        |
COL PCT        |COMPREHE|CORE GEN|STREAMIN|
               |NSIVE GE|AND SPEC|G BY STU|
               |NERAL   |COURSES |INTEREST|
               |       1|       2|       4|   TOTAL
---------------+--------+--------+--------+
USED <= 1      |      8 |      4 |     20 |      32
               |   7.14 |   3.57 |  17.86 |   28.57
               |  25.00 |  12.50 |  62.50 |
               |  32.00 |  12.12 |  37.04 |
---------------+--------+--------+--------+
1 < USED <= 2  |      4 |      5 |     15 |      24
               |   3.57 |   4.46 |  13.39 |   21.43
               |  16.67 |  20.83 |  62.50 |
               |  16.00 |  15.15 |  27.78 |
---------------+--------+--------+--------+
2 < USED       |     13 |     24 |     19 |      56
               |  11.61 |  21.43 |  16.96 |   50.00
               |  23.21 |  42.86 |  33.93 |
               |  52.00 |  72.73 |  35.19 |
---------------+--------+--------+--------+
TOTAL                25       33       54       112
                  22.32    29.46    48.21    100.00

FREQUENCY MISSING = 9

STATISTICS FOR TABLE OF COMPLEX USED BY SN015

STATISTIC                      DF     VALUE     PROB
CHI-SQUARE                      4    12.349    0.015
LIKELIHOOD RATIO CHI-SQUARE     4    12.968    0.011
MANTEL-HAENSZEL CHI-SQUARE      1     4.585    0.032
PHI                                   0.332
CONTINGENCY COEFFICIENT               0.315
CRAMER'S V                            0.235

EFFECTIVE SAMPLE SIZE = 112
FREQUENCY MISSING = 9



Appendix 2
SIMS International and National Reports



NATIONAL REPORTS ON SECOND INTERNATIONAL
MATHEMATICS STUDY

(NOVEMBER, 1989) 



CANADA (British Columbia) 

Robitaille, David F., J. Thomas O'Shea and Michael Dirks (1982) The Teaching and
Learning of Mathematics in British Columbia. Victoria, BC: Ministry of Education,
Learning Assessment Branch.

Robitaille, D.F. (1985) An Analysis of Selected Achievement Data from the Second
International Mathematics Study. Victoria, BC: Ministry of Education, Student
Assessment Branch.

CANADA (Ontario) 

McLean, L., D. Raphael, M. Wahlstrom (1986) Intentions and Attainments in the
Teaching and Learning of Mathematics. Reports on the Second International
Mathematics Study - Ontario. Toronto: Ontario Ministry of Education.

McLean, L., R. Wolfe, M. Wahlstrom (1987) Learning About Teaching from
Comparative Studies: Ontario Mathematics in International Perspective. Toronto, Ont.:
Ontario Ministry of Education.

Raphael, D., M. Wahlstrom, L. McLean (198?) The Second International Study of
Mathematics: An Overview of the Ontario Grade 8 Study. Toronto, Ont.: Ontario
Institute for Studies in Education.

Raphael, D., M. Wahlstrom, L. McLean (1983) The Second International Study of
Mathematics: An Overview of the Ontario Grade 12/13 Study. Toronto, Ont.: Ontario
Institute for Studies in Education.

ENGLAND AND WALES 

Cresswell, M., J. Grubb (1987) The Second International Mathematics Study in England
and Wales. International Studies in Pupil Performance. Windsor, Berks: NFER-Nelson.

FINLAND 

Kangasniemi, E. (1988) Opetussuunnitelma ja matematiikan koulusaavutukset
(Curriculum and student achievement in mathematics). Research Reports: Publication
Series A. Institute for Educational Research, University of Jyvaskyla, Finland.

FRANCE 

Robin, D., E. Barrier (1985) Enquete internationale sur l'enseignement des
mathematiques: Le cas francais (International Mathematics Study: The French Case).
Tome 1. INRP, Collection. National de Recherche Pedagogique. "Rapports de
recherches" 8. Paris.

HONG KONG 

Brimer, A., P. Griffin (1985) Mathematics Achievement in Hong Kong Secondary
Schools. Center for Asian Studies, University of Hong Kong, Hong Kong.

ISRAEL 

Padua, M., R. Raz, Eds. (1982) Cross-National Study in Mathematics Grades 8, 12: The
Research Instruments. Israel Curriculum Center, Ministry of Education and Culture,
Jerusalem. [In Hebrew.]

Lewy, A. (1983) Attitudes and Attainments in Mathematics: A Technical Report of a
Survey in Grades 8 and 12. Israel Curriculum Center, Ministry of Education and
Culture, Jerusalem. [In Hebrew with English summary.]

Padua, H. (1983) Report on Mathematics Achievement Survey, Grades 8 and 12. Israel
Curriculum Center, Ministry of Education and Culture, Jerusalem. [In Hebrew with
English summary.]

Lewy, A. (1984) Mathematics achievement in Grade 12. In A. M. Mayer and P. Tamir
(Eds.), Science Teaching in Israel: Origins, Development and Achievements. The Amos
De-Shalit Science Teaching Center, Jerusalem. [Hebrew with English summary.]



JAPAN